GoogleCloudContentwarehouseV1ExportToCdwPipeline

AI Overview😉

  • This module appears to configure a pipeline that exports documents from Google's Document Warehouse to CDW. The attribute descriptions suggest that CDW refers to a Document AI Workbench dataset (the dataset resource name sits under a Document AI processor) rather than a general-purpose data warehouse. The configuration names the target dataset, lists the documents to export, sets the Cloud Storage folder used for staging, and sets the training/test split ratio.
  • Nothing in this module ties it directly to search ranking. Any effect on search results would be indirect and speculative: the pipeline supplies training and test documents for Document AI models, so a misconfigured export (wrong documents, wrong split ratio) could degrade the models trained on that data.
  • There is no evident way for a website to target this function directly. The most that can be said is that well-structured documents with clear, consistent metadata, such as stable document IDs and resource names, are easier for a document-processing pipeline to select, label, and transfer.

GoogleApi.ContentWarehouse.V1.Model.GoogleCloudContentwarehouseV1ExportToCdwPipeline (google_api_content_warehouse v0.4.0)

The configuration of exporting documents from the Document Warehouse to CDW pipeline.

Attributes

  • docAiDataset (type: String.t, default: nil) - Optional. The CDW dataset resource name. If not set, the documents will be exported to Cloud Storage only. Format: projects/{project}/locations/{location}/processors/{processor}/dataset
  • documents (type: list(String.t), default: nil) - The list of all the resource names of the documents to be processed. Format: projects/{project_number}/locations/{location}/documents/{document_id}.
  • exportFolderPath (type: String.t, default: nil) - The Cloud Storage folder path used to store the exported documents before being sent to CDW. Format: gs://<bucket-name>/<folder-name>.
  • trainingSplitRatio (type: number(), default: nil) - Ratio of the training dataset split. When importing into Document AI Workbench, documents are automatically divided into training and test splits according to this ratio. This field is required if docAiDataset (doc_ai_dataset in the API description) is set.
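The attribute descriptions above imply two rules: trainingSplitRatio is required once docAiDataset is set, and leaving docAiDataset unset means a Cloud Storage-only export. A minimal sketch of a pre-flight check for those rules follows. ExportConfigCheck, its return values, and the sample resource names are hypothetical and not part of google_api_content_warehouse. The 0 to 1 range check is an assumption, since the documentation only calls the value a ratio.

```elixir
defmodule ExportConfigCheck do
  @moduledoc """
  Hypothetical helper (not part of google_api_content_warehouse) that checks
  the documented rules for an export-to-CDW configuration.
  """

  # Accepts any map or struct that uses the four documented attribute keys.
  def validate(config) do
    dataset = Map.get(config, :docAiDataset)
    ratio = Map.get(config, :trainingSplitRatio)

    cond do
      # Documented rule: the ratio is required once a dataset is set.
      dataset != nil and ratio == nil ->
        {:error, :training_split_ratio_required}

      # Assumption: a ratio is a number between 0 and 1 inclusive.
      ratio != nil and not (is_number(ratio) and ratio >= 0 and ratio <= 1) ->
        {:error, :training_split_ratio_out_of_range}

      true ->
        {:ok, destination(dataset)}
    end
  end

  # Documented rule: without a dataset, documents go to Cloud Storage only.
  defp destination(nil), do: :cloud_storage_only
  defp destination(_dataset), do: :cloud_storage_then_cdw
end

# Placeholder values that follow the documented formats.
config = %{
  docAiDataset: "projects/my-project/locations/us/processors/my-processor/dataset",
  documents: ["projects/123/locations/us/documents/doc-1"],
  exportFolderPath: "gs://my-bucket/exports",
  trainingSplitRatio: 0.8
}

ExportConfigCheck.validate(config)
```

Because the check reads fields with Map.get/2, it should accept either a plain map, as here, or the model struct itself, since structs are maps underneath.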

Summary

Types

t()

Functions

decode(value, options)

Unwrap a decoded JSON object into its complex fields.

Types

t()

@type t() ::
  %GoogleApi.ContentWarehouse.V1.Model.GoogleCloudContentwarehouseV1ExportToCdwPipeline{
    docAiDataset: String.t() | nil,
    documents: [String.t()] | nil,
    exportFolderPath: String.t() | nil,
    trainingSplitRatio: number() | nil
  }

Functions

decode(value, options)

@spec decode(struct(), keyword()) :: struct()

Unwrap a decoded JSON object into its complex fields.