GoogleCloudContentwarehouseV1GcsIngestWithDocAiProcessorsPipeline

AI Overview😉

  • Potential purpose of module: This module appears to configure a pipeline that ingests documents from a Google Cloud Storage folder into Document Warehouse using Document AI (DocAI) processors. The pipeline can split and classify documents, run a matching extract processor on each one, and store the raw processor results in a Cloud Storage folder.
  • Impact on search results: If the extracted and classified output feeds a search index, it could make documents stored in Cloud Storage more discoverable. Nothing in the attributes listed for this module describes ranking, so any effect on search rankings is speculative.
  • Optimization for this function: Documents uploaded to Cloud Storage are more likely to be processed accurately if they are well structured, carry relevant metadata, and are easily accessible. Clear classification information could also help the DocAI processors match each document to the right extract processor.

GoogleApi.ContentWarehouse.V1.Model.GoogleCloudContentwarehouseV1GcsIngestWithDocAiProcessorsPipeline (google_api_content_warehouse v0.4.0)

The configuration of the Cloud Storage Ingestion with DocAI Processors pipeline.

Attributes

  • extractProcessorInfos (type: list(GoogleApi.ContentWarehouse.V1.Model.GoogleCloudContentwarehouseV1ProcessorInfo.t), default: nil) - Information about the extract processors. One matching extract processor is used to process each document, chosen according to the classify processor result. If no classify processor is specified, the first extract processor is used.
  • inputPath (type: String.t, default: nil) - The input Cloud Storage folder. All files under this folder will be imported to Document Warehouse. Format: gs:///.
  • pipelineConfig (type: GoogleApi.ContentWarehouse.V1.Model.GoogleCloudContentwarehouseV1IngestPipelineConfig.t, default: nil) - Optional. The config for the Cloud Storage Ingestion with DocAI Processors pipeline. It provides additional customization options to run the pipeline and can be skipped if it is not applicable.
  • processorResultsFolderPath (type: String.t, default: nil) - The Cloud Storage folder path used to store the raw results from processors. Format: gs:///.
  • skipIngestedDocuments (type: boolean(), default: nil) - Whether to skip documents that have already been ingested. If set to true, documents in Cloud Storage whose custom metadata contains the key "status" with the value "status=ingested" are skipped during ingestion.
  • splitClassifyProcessorInfo (type: GoogleApi.ContentWarehouse.V1.Model.GoogleCloudContentwarehouseV1ProcessorInfo.t, default: nil) - Information about the split and classify processor. Its result is used to find a matching extract processor.
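
The selection and skip rules described in the attributes above can be sketched in plain Elixir. This is an illustration only, not the service's implementation: the module name PipelineSketch is hypothetical, plain maps stand in for the ProcessorInfo struct, and matching on a documentType key is an assumption, since the attribute text says only that a "matched" extract processor is used.

```elixir
defmodule PipelineSketch do
  # Illustrative only: plain maps stand in for ProcessorInfo, and the
  # :documentType key used for matching is an assumption.

  # No classify result available: fall back to the first extract processor.
  # extractProcessorInfos defaults to nil, so treat nil as an empty list.
  def pick_extract_processor(extract_infos, nil) do
    List.first(extract_infos || [])
  end

  # Classify result available: return the first extract processor whose
  # (assumed) document type equals it, or nil when none matches.
  def pick_extract_processor(extract_infos, classified_type) do
    Enum.find(extract_infos || [], fn info ->
      info.documentType == classified_type
    end)
  end

  # Mirrors the skipIngestedDocuments description: skip only when the flag
  # is true and the custom metadata has "status" set to "status=ingested".
  def skip?(true, %{"status" => "status=ingested"}), do: true
  def skip?(_flag, _custom_metadata), do: false
end

# Hypothetical processor entries for demonstration.
infos = [
  %{documentType: "invoice", processorName: "extract-invoice"},
  %{documentType: "receipt", processorName: "extract-receipt"}
]

IO.inspect(PipelineSketch.pick_extract_processor(infos, "receipt"))
IO.inspect(PipelineSketch.pick_extract_processor(infos, nil))
IO.inspect(PipelineSketch.skip?(true, %{"status" => "status=ingested"}))
```

Because skipIngestedDocuments defaults to nil, the sketch treats any value other than true as "do not skip".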

Summary

Types

t()

Functions

decode(value, options)

Unwrap a decoded JSON object into its complex fields.

Types

t()

@type t() ::
  %GoogleApi.ContentWarehouse.V1.Model.GoogleCloudContentwarehouseV1GcsIngestWithDocAiProcessorsPipeline{
    extractProcessorInfos:
      [
        GoogleApi.ContentWarehouse.V1.Model.GoogleCloudContentwarehouseV1ProcessorInfo.t()
      ]
      | nil,
    inputPath: String.t() | nil,
    pipelineConfig:
      GoogleApi.ContentWarehouse.V1.Model.GoogleCloudContentwarehouseV1IngestPipelineConfig.t()
      | nil,
    processorResultsFolderPath: String.t() | nil,
    skipIngestedDocuments: boolean() | nil,
    splitClassifyProcessorInfo:
      GoogleApi.ContentWarehouse.V1.Model.GoogleCloudContentwarehouseV1ProcessorInfo.t()
      | nil
  }
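
As a minimal sketch of the type above, the struct can be built as an ordinary Elixir struct literal. The snippet assumes the google_api_content_warehouse package can be fetched from Hex with Mix.install; the bucket and folder names are hypothetical, and the alias Pipeline is introduced only for brevity.

```elixir
# Assumption: the Hex package is available and fetched at a 0.4.x version.
Mix.install([{:google_api_content_warehouse, "~> 0.4"}])

alias GoogleApi.ContentWarehouse.V1.Model.GoogleCloudContentwarehouseV1GcsIngestWithDocAiProcessorsPipeline,
  as: Pipeline

# Hypothetical bucket and folder names, used only for illustration.
pipeline = %Pipeline{
  inputPath: "gs://example-bucket/incoming",
  processorResultsFolderPath: "gs://example-bucket/processor-results",
  skipIngestedDocuments: true
}

# Fields left out of the literal keep their documented default of nil.
IO.inspect(pipeline.pipelineConfig, label: "pipelineConfig")
IO.inspect(pipeline.extractProcessorInfos, label: "extractProcessorInfos")
```

Every field in the type is nilable, so a partially filled struct like this one is valid at the type level. Whether the service accepts it is a separate question that this sketch does not address.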

Functions

decode(value, options)

@spec decode(struct(), keyword()) :: struct()

Unwrap a decoded JSON object into its complex fields.
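
In practice decode/2 is usually reached indirectly, through the JSON decoder, rather than called by hand. The sketch below assumes that Poison is available as a transitive dependency of the package and that the model implements Poison's decoder protocol, as the generated Google API clients for Elixir generally do. The JSON payload and its values are hypothetical.

```elixir
# Assumption: the Hex package is available and brings in Poison transitively.
Mix.install([{:google_api_content_warehouse, "~> 0.4"}])

alias GoogleApi.ContentWarehouse.V1.Model.GoogleCloudContentwarehouseV1GcsIngestWithDocAiProcessorsPipeline,
  as: Pipeline

# Hypothetical payload; the keys match the attribute names documented above.
json = ~s({"inputPath": "gs://example-bucket/incoming", "skipIngestedDocuments": true})

# Decoding with `as:` yields the model struct instead of a plain map.
pipeline = Poison.decode!(json, as: %Pipeline{})

IO.inspect(pipeline.inputPath, label: "inputPath")
IO.inspect(pipeline.skipIngestedDocuments, label: "skipIngestedDocuments")
```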