ResearchScienceSearchSourceUrlDocjoinInfo

AI Overview

  • This module likely stores metadata drawn from docjoin for a dataset's source URL: the URL, title, language, and content classification, plus signals such as PageRank, Navboost queries, and entity annotations. This metadata is probably used to inform search ranking and improve the search experience.
  • This module could affect search results by feeding page-level metadata into ranking. For example, a page with a high PageRank or relevance score may rank higher, while a page with low-quality or irrelevant content may rank lower. The same metadata could also be used to filter out or demote pages that fall short of quality or relevance standards.
  • A website may improve its standing with this module by giving its pages accurate, descriptive metadata, such as titles, and by publishing high-quality, relevant, informative content. Keeping a page's language and content classification accurate may also improve its visibility in search results.

GoogleApi.ContentWarehouse.V1.Model.ResearchScienceSearchSourceUrlDocjoinInfo (google_api_content_warehouse v0.4.0)

The proto containing all the information we extracted from docjoin, for the source_url of the dataset. NEXT TAG: 18

Attributes

  • dataSource (type: String.t, default: nil) -
  • displayUrl (type: String.t, default: nil) - The url used to display in the google search results.
  • docid (type: String.t, default: nil) - The docid of the document.
  • indexTier (type: list(String.t), default: nil) - Index tiers (BASE, UNIFIED_ZEPPELIN, etc) that the document belongs to. NOTE: Each document may belong to multiple tiers. NOTE: The original data type is an enum CompositeDoc::SubIndexType. However we don't want to depend on segindexer/compositedoc.proto because the proto is too large. Instead, we use CompositeDoc::SubIndexType_Name(subindexid) to convert into a string representation. To convert string back to CompositeDoc::SubIndexType, use CompositeDoc::SubIndexType_Parse.
  • languageCode (type: String.t, default: nil) - The language of the document in the string representation of LanguageCode. Converts from Language Enum to LanguageCode through i18n/identifiers/langenclanguagecodeconverter.h. Please use i18n/identifiers/languagecodeconverter.h for converting between LanguageCode and string representation.
  • latestPageUpdateDate (type: String.t, default: nil) - The syntactic date of a dataset document that reflects the publication date of the content.
  • navboostQuery (type: list(GoogleApi.ContentWarehouse.V1.Model.ResearchScienceSearchNavboostQueryInfo.t), default: nil) - A sequence of Navboost queries for the dataset source_url.
  • pagerank (type: integer(), default: nil) - The page rank of the document. DEPRECATED in favour of Pagerank_NS. Do not use as it is no longer maintained in docjoins and can break at any moment.
  • pagerankNs (type: integer(), default: nil) - The production pagerank value of the document.
  • petacatInfo (type: GoogleApi.ContentWarehouse.V1.Model.FatcatCompactDocClassification.t, default: nil) - Petacat classifications for the web document. Normally the results from calling Petacat come in a PetacatResponse, which is very flexible and extensible. This proto takes most of the flexibility away - only rephil clusters, taxonomic classifications, and binary classifications, with discretized weights.
  • salientTerms (type: GoogleApi.ContentWarehouse.V1.Model.QualitySalientTermsSalientTermSet.t, default: nil) - A set of salient terms extracted from the document. DEPRECATED. Moved to DatasetMetadata for performance reasons.
  • scholarInfo (type: GoogleApi.ContentWarehouse.V1.Model.ScienceIndexSignal.t, default: nil) - Science per-doc data for inclusion in websearch.
  • sporeGraphMid (type: list(String.t), default: nil) - A set of entities from WebRef annotations that are in SPORE_GRAPH.
  • title (type: String.t, default: nil) - The title of the document.
  • topEntity (type: list(GoogleApi.ContentWarehouse.V1.Model.RepositoryWebrefWebrefEntity.t), default: nil) - A set of top entities from WebrefAnnotation, top is defined by topicality score, see go/topicality-score for detail. DEPRECATED. See label_to_mids_map instead.
  • url (type: String.t, default: nil) - The url of the document.
  • webrefEntity (type: list(GoogleApi.ContentWarehouse.V1.Model.ResearchScienceSearchSourceUrlDocjoinInfoWebrefEntityInfo.t), default: nil) - A set of entities copied from WebRefEntities on cDoc.
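
Two details in the attribute list are easy to trip over when reading a value of this type: every attribute defaults to nil (including the list-typed ones), and pagerank is documented as deprecated in favour of pagerankNs. The following is a minimal sketch of how a caller might handle both. The module DocjoinInfoSketch and its function names are hypothetical and not part of google_api_content_warehouse, and the sample values are made up. A plain map stands in for the struct here; since Elixir structs are maps, Map.get/2 should behave the same way on the real struct.

```elixir
defmodule DocjoinInfoSketch do
  # Hypothetical helper module; not part of google_api_content_warehouse.
  # `info` may be any map (or struct) carrying the attribute names listed above.

  # pagerank is documented as deprecated, so only pagerankNs is read.
  def pagerank(info), do: Map.get(info, :pagerankNs)

  # List-typed attributes default to nil; treat nil as an empty list.
  def index_tiers(info), do: Map.get(info, :indexTier) || []

  # True when the document belongs to the given tier name, e.g. "BASE".
  def in_tier?(info, tier), do: tier in index_tiers(info)
end

# Made-up values, for illustration only.
info = %{url: "https://example.org/dataset", pagerank: 3, pagerankNs: 5, indexTier: ["BASE"]}

IO.inspect(DocjoinInfoSketch.pagerank(info))
IO.inspect(DocjoinInfoSketch.in_tier?(info, "BASE"))
IO.inspect(DocjoinInfoSketch.index_tiers(%{}))
```

Tier names are compared as strings because indexTier holds the string form of the enum rather than the enum itself.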

Summary

Types

t()

Functions

decode(value, options)

Unwrap a decoded JSON object into its complex fields.

Types

t()

@type t() ::
  %GoogleApi.ContentWarehouse.V1.Model.ResearchScienceSearchSourceUrlDocjoinInfo{
    dataSource: String.t() | nil,
    displayUrl: String.t() | nil,
    docid: String.t() | nil,
    indexTier: [String.t()] | nil,
    languageCode: String.t() | nil,
    latestPageUpdateDate: String.t() | nil,
    navboostQuery:
      [
        GoogleApi.ContentWarehouse.V1.Model.ResearchScienceSearchNavboostQueryInfo.t()
      ]
      | nil,
    pagerank: integer() | nil,
    pagerankNs: integer() | nil,
    petacatInfo:
      GoogleApi.ContentWarehouse.V1.Model.FatcatCompactDocClassification.t()
      | nil,
    salientTerms:
      GoogleApi.ContentWarehouse.V1.Model.QualitySalientTermsSalientTermSet.t()
      | nil,
    scholarInfo:
      GoogleApi.ContentWarehouse.V1.Model.ScienceIndexSignal.t() | nil,
    sporeGraphMid: [String.t()] | nil,
    title: String.t() | nil,
    topEntity:
      [GoogleApi.ContentWarehouse.V1.Model.RepositoryWebrefWebrefEntity.t()]
      | nil,
    url: String.t() | nil,
    webrefEntity:
      [
        GoogleApi.ContentWarehouse.V1.Model.ResearchScienceSearchSourceUrlDocjoinInfoWebrefEntityInfo.t()
      ]
      | nil
  }

Functions

decode(value, options)

@spec decode(struct(), keyword()) :: struct()

Unwrap a decoded JSON object into its complex fields.
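
To make "unwrap ... into its complex fields" concrete, here is a rough sketch of the general idea, assuming it means that nested values which arrive as plain maps after JSON parsing are converted into their model structs. This is not the library's implementation. DecodeSketch, its nested Info and QueryInfo structs, and the query field are all hypothetical stand-ins chosen for illustration.

```elixir
defmodule DecodeSketch do
  # Hypothetical stand-ins for the generated model structs.
  defmodule QueryInfo do
    defstruct [:query]
  end

  defmodule Info do
    defstruct [:url, :navboostQuery]
  end

  # A nil list is left untouched, matching the documented nil defaults.
  def decode(%Info{navboostQuery: nil} = info), do: info

  # Each nested plain map under :navboostQuery becomes a QueryInfo struct.
  def decode(%Info{navboostQuery: queries} = info) do
    %Info{info | navboostQuery: Enum.map(queries, &struct(QueryInfo, &1))}
  end
end

# Made-up input: the nested entry is still a plain map before decoding.
raw = struct(DecodeSketch.Info, url: "https://example.org/dataset", navboostQuery: [%{query: "example query"}])

IO.inspect(DecodeSketch.decode(raw).navboostQuery)
```

Scalar fields such as url pass through unchanged; only fields whose type is another model need this step.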