GoodocLanguageLabel

AI Overview😉

  • The potential purpose of this module is to identify and label the language of content within a webpage or document, allowing Google to better understand and categorize the content for search results.
  • This module could impact search results by influencing the ranking of pages based on their language, allowing Google to prioritize content that is most relevant to the user's language preferences. It may also help Google to filter out content that is not relevant to the user's language, improving the overall quality of search results.
  • To be more favorable for this function, a website could ensure that its content is properly labeled with language codes, using standardized formats such as ISO 639 and BCP 47. This could include using language-specific URLs, meta tags, and content headers to clearly indicate the language of the content. Additionally, websites could ensure that their content is easily machine-readable, allowing Google's algorithms to accurately identify and categorize the language of the content.

GoogleApi.ContentWarehouse.V1.Model.GoodocLanguageLabel (google_api_content_warehouse v0.4.0)

Language label

Attributes

  • ClosestLanguageId (type: integer(), default: nil) - Closest id from i18n/languages/proto/languages.proto; caveat: may not accurately capture the language. GoodocLanguageCodeToLanguage() declared in ocr/goodoc/goodoc-utils.h may be used to convert a Language enum (i18n/languages/proto/languages.proto) to a string suitable for this field.
  • Confidence (type: integer(), default: nil) - Confidence level on that language, between 0 and 100
  • LanguageCode (type: String.t, default: nil) - Old (Ocean) Language Code Usage: The language code is inferred while the Garbage Text Detector runs and is set at the paragraph, block, and page level. The language code is a string of 3 or more characters. The first 3 letters specify the language, according to ISO 639. Optionally, the 3-letter code can be extended with an underscore and a language variant specifier. Specifiers exist for regional variants and for different forms of language spelling. Regional variants are specified as a 2-letter country code, according to ISO 3166.
    Some examples:
      Standard:
        "por" - Portuguese, standard
        "rus" - Russian, standard
      Regional variants:
        "por_br" - Portuguese, Brazilian
        "eng_us" - English, United States
      Variants of spelling:
        "rus_old" - Russian, old spelling
        "chi_tra" - Chinese, traditional
        "ger_new" - German, new spelling
    LanguageToGoodocLanguageCode(), declared in ocr/goodoc/goodoc-utils.h, may be used to convert a Language enum (i18n/languages/proto/languages.proto) to a string suitable for this field.
    New Language Code Usage: Most of the usages described above were standardized in BCP 47, and these codes are the new standard to be used in this field. To load either new or old language codes into LanguageCode objects, use the function FromOceanCode() in ocr/quality/lang_util.h. Note that ocr::FromOceanCode can transform either version of the language code into a C++ i18n_identifiers::LanguageCode.
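The old (Ocean) code format described above can be illustrated with a small parser. This is only a sketch: the module name OceanCodeSketch and its functions are hypothetical, it is not the FromOceanCode() or LanguageToGoodocLanguageCode() helper named in the attribute description, and the rule "a 2-letter specifier is a regional variant, anything else is a spelling variant" is an assumption inferred from the examples rather than a documented algorithm.

```elixir
defmodule OceanCodeSketch do
  @moduledoc """
  Hypothetical helper: splits an old-style (Ocean) language code such as
  "por_br" into its 3-letter language part and optional variant specifier.
  """

  # Accepts "lll" or "lll_variant"; returns {:ok, map} or :error.
  def parse(code) when is_binary(code) do
    case String.split(code, "_", parts: 2) do
      [lang] -> build(lang, nil)
      [lang, variant] when variant != "" -> build(lang, variant)
      _ -> :error
    end
  end

  # The language part must be exactly three lowercase ASCII letters.
  # This checks the shape only, not membership in ISO 639.
  defp build(lang, variant) do
    if Regex.match?(~r/\A[a-z]{3}\z/, lang) do
      {:ok, %{language: lang, variant: variant, kind: classify(variant)}}
    else
      :error
    end
  end

  # Assumption: a 2-letter specifier is an ISO 3166 country code (regional
  # variant); any other specifier is treated as a spelling variant.
  defp classify(nil), do: :standard

  defp classify(variant) do
    if Regex.match?(~r/\A[a-z]{2}\z/, variant), do: :regional, else: :spelling
  end
end
```

With this sketch, OceanCodeSketch.parse("por_br") yields {:ok, %{language: "por", variant: "br", kind: :regional}}, while a BCP 47 tag such as "pt-BR" returns :error because the sketch only recognizes the old format.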

Summary

Types

t()

Functions

decode(value, options)

Unwrap a decoded JSON object into its complex fields.

Types

t()

@type t() :: %GoogleApi.ContentWarehouse.V1.Model.GoodocLanguageLabel{
  ClosestLanguageId: integer() | nil,
  Confidence: integer() | nil,
  LanguageCode: String.t() | nil
}
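The type above allows any integer (or nil) for Confidence, so the documented 0 to 100 range is not enforced by the type itself. A minimal sketch of a range check follows. LanguageLabelSketch is a hypothetical local stand-in that mirrors the three documented fields so the example runs without the google_api_content_warehouse package, and treating nil as acceptable is an assumption based on the nil defaults.

```elixir
defmodule LanguageLabelSketch do
  # Hypothetical stand-in mirroring the documented fields of
  # GoogleApi.ContentWarehouse.V1.Model.GoodocLanguageLabel.
  defstruct [:ClosestLanguageId, :Confidence, :LanguageCode]

  # An unset confidence (nil) is accepted, matching the nil default.
  def valid_confidence?(%__MODULE__{Confidence: nil}), do: true

  # A set confidence must be an integer between 0 and 100 inclusive.
  def valid_confidence?(%__MODULE__{Confidence: c}) when is_integer(c), do: c in 0..100

  # Any other value (for example a string or float) is rejected.
  def valid_confidence?(%__MODULE__{}), do: false
end
```

For example, a label built with struct(LanguageLabelSketch, Confidence: 87) passes the check, while one with Confidence: 101 does not.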

Functions

decode(value, options)

@spec decode(struct(), keyword()) :: struct()

Unwrap a decoded JSON object into its complex fields.