GoogleCloudContentwarehouseV1IngestPipelineConfig

AI Overview😉

  • The potential purpose of this module is to configure the ingestion pipeline for Document AI Warehouse, which involves setting up Cloud Functions, access controls, and document processing settings. This module seems to be responsible for defining how documents are ingested, processed, and stored in the warehouse.
  • This module could impact search results by influencing how documents are indexed, stored, and accessed. The access control settings, for example, could affect which documents are visible to certain users or groups. The document text extraction setting could also impact the quality of search results by determining whether the full text of documents is available for search.
  • To be more favorable for this function, a website could ensure that its documents are properly formatted and structured to take advantage of the document text extraction feature. Additionally, the website could implement clear access control policies and assign relevant roles to users or groups to ensure that documents are properly secured. Furthermore, the website could optimize its Cloud Function to complete within the 5-minute timeout to prevent ingestion failures.

Interesting Module? Vote 👇

Voting helps other researchers find interesting modules.

Current Votes: 0

GoogleApi.ContentWarehouse.V1.Model.GoogleCloudContentwarehouseV1IngestPipelineConfig (google_api_content_warehouse v0.4.0)

The ingestion pipeline config.

Attributes

  • cloudFunction (type: String.t, default: nil) - The Cloud Function resource name. The Cloud Function needs to live inside consumer project and is accessible to Document AI Warehouse P4SA. Only Cloud Functions V2 is supported. Cloud function execution should complete within 5 minutes or this file ingestion may fail due to timeout. Format: https://{region}-{project_id}.cloudfunctions.net/{cloud_function} The following keys are available the request json payload. display_name properties plain_text reference_id document_schema_name raw_document_path raw_document_file_type The following keys from the cloud function json response payload will be ingested to the Document AI Warehouse as part of Document proto content and/or related information. The original values will be overridden if any key is present in the response. display_name properties plain_text document_acl_policy folder
  • documentAclPolicy (type: GoogleApi.ContentWarehouse.V1.Model.GoogleIamV1Policy.t, default: nil) - The document level acl policy config. This refers to an Identity and Access (IAM) policy, which specifies access controls for all documents ingested by the pipeline. The role and members under the policy needs to be specified. The following roles are supported for document level acl control: roles/contentwarehouse.documentAdmin roles/contentwarehouse.documentEditor roles/contentwarehouse.documentViewer The following members are supported for document level acl control: user:[email protected] * group:[email protected] Note that for documents searched with LLM, only single level user or group acl check is supported.
  • enableDocumentTextExtraction (type: boolean(), default: nil) - The document text extraction enabled flag. If the flag is set to true, DWH will perform text extraction on the raw document.
  • folder (type: String.t, default: nil) - Optional. The name of the folder to which all ingested documents will be linked during ingestion process. Format is projects/{project}/locations/{location}/documents/{folder_id}

Summary

Types

t()

Functions

decode(value, options)

Unwrap a decoded JSON object into its complex fields.

Types

Link to this type

t()

@type t() ::
  %GoogleApi.ContentWarehouse.V1.Model.GoogleCloudContentwarehouseV1IngestPipelineConfig{
    cloudFunction: String.t() | nil,
    documentAclPolicy:
      GoogleApi.ContentWarehouse.V1.Model.GoogleIamV1Policy.t() | nil,
    enableDocumentTextExtraction: boolean() | nil,
    folder: String.t() | nil
  }

Functions

Link to this function

decode(value, options)

@spec decode(struct(), keyword()) :: struct()

Unwrap a decoded JSON object into its complex fields.