IndexingDocjoinerAnchorStatistics

AI Overview

  • This module appears to analyze anchor text data for the links pointing at a web page in order to judge their quality and relevance. That includes flagging spammy or low-quality links, counting unique anchor phrases, and tracking how many domains and pages link to a site.
  • It could affect search results by influencing how pages rank based on the quality and relevance of their incoming links. Pages with high-quality, relevant links may rank higher, while pages with low-quality or spammy links may be penalized.
  • To fare well under this module, a website could focus on earning high-quality, relevant links from trusted sources: publishing informative content that reputable sites want to link to, using link-building approaches such as guest blogging or broken link building, and avoiding spammy or manipulative tactics. A clear internal linking structure also makes the site easier for users and search engines to navigate.

GoogleApi.ContentWarehouse.V1.Model.IndexingDocjoinerAnchorStatistics (google_api_content_warehouse v0.4.0)

Statistics of the anchors in a docjoin. Next available tag ID: 63.

Attributes

  • penguinLastUpdate (type: integer(), default: nil) - BEGIN: Penguin related fields. Timestamp when penguin scores were last updated. Measured in days since Jan. 1st 1995.
  • anchorCount (type: integer(), default: nil) -
  • badbacklinksPenalized (type: boolean(), default: nil) - Whether this doc is penalized by BadBackLinks, in which case we should not use improvanchor score in mustang ascorer.
  • penguinPenalty (type: number(), default: nil) - Page-level penguin penalty (0 = good, 1 = bad).
  • minHostHomePageLocalOutdegree (type: integer(), default: nil) - Minimum local outdegree of all anchor sources that are host home pages as well as on the same host as the current target URL.
  • droppedRedundantAnchorCount (type: integer(), default: nil) - Sum of anchors_dropped in the repeated group RedundantAnchorInfo, but can go higher if the latter reaches the cap of kMaxRecordsToKeep (indexing/docjoiner/anchors/anchor-loader.cc), currently 10,000.
  • nonLocalAnchorCount (type: integer(), default: nil) -
  • mediumCorpusAnchorCount (type: integer(), default: nil) -
  • penguinEarlyAnchorProtected (type: boolean(), default: nil) - Doc is protected by goodness of early anchors.
  • droppedHomepageAnchorCount (type: integer(), default: nil) -
  • redundantanchorinfoforphrasecap (type: list(GoogleApi.ContentWarehouse.V1.Model.IndexingDocjoinerAnchorStatisticsRedundantAnchorInfoForPhraseCap.t), default: nil) -
  • forwardedOffdomainAnchorCount (type: integer(), default: nil) -
  • droppedNonLocalAnchorCount (type: integer(), default: nil) -
  • perdupstats (type: list(GoogleApi.ContentWarehouse.V1.Model.IndexingDocjoinerAnchorStatisticsPerDupStats.t), default: nil) -
  • onsiteAnchorCount (type: integer(), default: nil) -
  • droppedLocalAnchorCount (type: integer(), default: nil) -
  • penguinTooManySources (type: boolean(), default: nil) - Doc not scored because it has too many anchor sources. END: Penguin related fields.
  • forwardedAnchorCount (type: integer(), default: nil) -
  • anchorSpamInfo (type: GoogleApi.ContentWarehouse.V1.Model.IndexingDocjoinerAnchorSpamInfo.t, default: nil) - This structure contains signals and penalties of AnchorSpamPenalizer. It replaces phrase_anchor_spam_info, which is deprecated.
  • lowCorpusAnchorCount (type: integer(), default: nil) -
  • lowCorpusOffdomainAnchorCount (type: integer(), default: nil) -
  • baseAnchorCount (type: integer(), default: nil) -
  • minDomainHomePageLocalOutdegree (type: integer(), default: nil) - Minimum local outdegree of all anchor sources that are domain home pages as well as on the same domain as the current target URL.
  • skippedAccumulate (type: integer(), default: nil) - A count of the number of times anchor accumulation has been skipped for this document. Note: Only used when canonical.
  • topPrOnsiteAnchorCount (type: integer(), default: nil) - According to the anchor quality bucket, an anchor with pagerank > 51000 is the best anchor; anchors with pagerank < 47000 are all the same.
  • pageMismatchTaggedAnchors (type: integer(), default: nil) -
  • spamLog10Odds (type: number(), default: nil) - The log base 10 odds that this set of anchors exhibits spammy behavior. Computed in the AnchorLocalizer.
  • redundantanchorinfo (type: list(GoogleApi.ContentWarehouse.V1.Model.IndexingDocjoinerAnchorStatisticsRedundantAnchorInfo.t), default: nil) -
  • pageFromExpiredTaggedAnchors (type: integer(), default: nil) - Set in SignalPenalizer::FillInAnchorStatistics.
  • baseOffdomainAnchorCount (type: integer(), default: nil) -
  • phraseAnchorSpamInfo (type: GoogleApi.ContentWarehouse.V1.Model.IndexingDocjoinerAnchorPhraseSpamInfo.t, default: nil) - The following signals identify a spike of spammy anchor phrases. Anchors created during the spike are tagged with LINK_SPAM_PHRASE_SPIKE.
  • anchorPhraseCount (type: integer(), default: nil) - The number of unique anchor phrases. Capped by the constant kMaxAnchorPhraseCountInStats (=5000) defined in indexing/docjoiner/anchors/anchor-manager.cc.
  • ondomainAnchorCount (type: integer(), default: nil) -
  • totalDomainsAbovePhraseCap (type: integer(), default: nil) - Number of domains above the per-domain phrase cap, i.e. domains in which too many phrases were seen.
  • totalDomainsSeen (type: integer(), default: nil) - Number of domains seen in total.
  • topPrOffdomainAnchorCount (type: integer(), default: nil) -
  • scannedAnchorCount (type: integer(), default: nil) - The total number of anchors being scanned from storage.
  • localAnchorCount (type: integer(), default: nil) -
  • linkBeforeSitechangeTaggedAnchors (type: integer(), default: nil) -
  • globalAnchorDelta (type: integer(), default: nil) - Metric of the number of changed global anchors, computed as size(union(previous, new) - intersection(previous, new)).
  • topPrOndomainAnchorCount (type: integer(), default: nil) -
  • mediumCorpusOffdomainAnchorCount (type: integer(), default: nil) -
  • offdomainAnchorCount (type: integer(), default: nil) -
  • totalDomainPhrasePairsSeenApprox (type: integer(), default: nil) - Number of domain/phrase pairs in total -- i.e. how many anchors we would have if the domain/phrase cutoff was set to 1 instead of 200. This is "approx" for large anchor clusters because there can be double counting when the LRU cache forgets about rare domain/phrase pairs.
  • skippedOrReusedReason (type: String.t, default: nil) - Reason for skipping accumulation when skipped, or reason for reprocessing when not skipped.
  • anchorsWithDedupedImprovanchors (type: integer(), default: nil) - The number of anchors for which some ImprovAnchors phrases have been removed due to duplication within source org.
  • fakeAnchorCount (type: integer(), default: nil) -
  • redundantAnchorForPhraseCapCount (type: integer(), default: nil) - Total anchors dropped for exceeding the per-domain phrase cap. Equal to the sum of anchors_dropped in the repeated group RedundantAnchorInfoForPhraseCap, but can go higher if the latter reaches the cap of kMaxDomainsToKeepForPhraseCap (indexing/docjoiner/anchors/anchor-loader.h), currently 1000.
  • totalDomainPhrasePairsAboveLimit (type: integer(), default: nil) - Should be equal to the size of the repeated group that follows it in the source definition, except that it can go higher than 10,000.
  • timestamp (type: integer(), default: nil) - Walltime of when anchors were accumulated last.
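
Several of the attribute descriptions above define a value by a small computation. The Python sketch below illustrates four of them; it is not taken from the library or from Google's code. All function names are hypothetical, and it assumes that day 0 of penguinLastUpdate is Jan. 1st 1995 itself and that "log base 10 odds" means log10(p / (1 - p)).

```python
from datetime import date, timedelta

# Epoch stated in the penguinLastUpdate description.
# Assumption: a value of 0 corresponds to this date exactly.
PENGUIN_EPOCH = date(1995, 1, 1)


def penguin_last_update_to_date(days: int) -> date:
    """Convert penguinLastUpdate (days since Jan. 1st 1995) to a calendar date."""
    return PENGUIN_EPOCH + timedelta(days=days)


def log10_odds_to_probability(log10_odds: float) -> float:
    """Convert a spamLog10Odds-style value to a probability.

    Assumption: odds = p / (1 - p), so p = odds / (1 + odds).
    """
    odds = 10.0 ** log10_odds
    return odds / (1.0 + odds)


def global_anchor_delta(previous: set, new: set) -> int:
    """size(union(previous, new) - intersection(previous, new))."""
    return len((previous | new) - (previous & new))


def redundant_records_truncated(dropped_total: int, per_record_dropped: list) -> bool:
    """True when droppedRedundantAnchorCount exceeds the sum of anchors_dropped
    over the kept records, which the description ties to the record cap."""
    return dropped_total > sum(per_record_dropped)
```

For instance, `global_anchor_delta({"a", "b"}, {"b", "c"})` counts the two anchors that appear in only one of the sets, which is the symmetric difference of the previous and new anchor sets.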

Summary

Types

t()

Functions

decode(value, options)

Unwrap a decoded JSON object into its complex fields.

Types

t()

@type t() :: %GoogleApi.ContentWarehouse.V1.Model.IndexingDocjoinerAnchorStatistics{
  anchorCount: integer() | nil,
  anchorPhraseCount: integer() | nil,
  anchorSpamInfo:
    GoogleApi.ContentWarehouse.V1.Model.IndexingDocjoinerAnchorSpamInfo.t()
    | nil,
  anchorsWithDedupedImprovanchors: integer() | nil,
  badbacklinksPenalized: boolean() | nil,
  baseAnchorCount: integer() | nil,
  baseOffdomainAnchorCount: integer() | nil,
  droppedHomepageAnchorCount: integer() | nil,
  droppedLocalAnchorCount: integer() | nil,
  droppedNonLocalAnchorCount: integer() | nil,
  droppedRedundantAnchorCount: integer() | nil,
  fakeAnchorCount: integer() | nil,
  forwardedAnchorCount: integer() | nil,
  forwardedOffdomainAnchorCount: integer() | nil,
  globalAnchorDelta: integer() | nil,
  linkBeforeSitechangeTaggedAnchors: integer() | nil,
  localAnchorCount: integer() | nil,
  lowCorpusAnchorCount: integer() | nil,
  lowCorpusOffdomainAnchorCount: integer() | nil,
  mediumCorpusAnchorCount: integer() | nil,
  mediumCorpusOffdomainAnchorCount: integer() | nil,
  minDomainHomePageLocalOutdegree: integer() | nil,
  minHostHomePageLocalOutdegree: integer() | nil,
  nonLocalAnchorCount: integer() | nil,
  offdomainAnchorCount: integer() | nil,
  ondomainAnchorCount: integer() | nil,
  onsiteAnchorCount: integer() | nil,
  pageFromExpiredTaggedAnchors: integer() | nil,
  pageMismatchTaggedAnchors: integer() | nil,
  penguinEarlyAnchorProtected: boolean() | nil,
  penguinLastUpdate: integer() | nil,
  penguinPenalty: number() | nil,
  penguinTooManySources: boolean() | nil,
  perdupstats:
    [
      GoogleApi.ContentWarehouse.V1.Model.IndexingDocjoinerAnchorStatisticsPerDupStats.t()
    ]
    | nil,
  phraseAnchorSpamInfo:
    GoogleApi.ContentWarehouse.V1.Model.IndexingDocjoinerAnchorPhraseSpamInfo.t()
    | nil,
  redundantAnchorForPhraseCapCount: integer() | nil,
  redundantanchorinfo:
    [
      GoogleApi.ContentWarehouse.V1.Model.IndexingDocjoinerAnchorStatisticsRedundantAnchorInfo.t()
    ]
    | nil,
  redundantanchorinfoforphrasecap:
    [
      GoogleApi.ContentWarehouse.V1.Model.IndexingDocjoinerAnchorStatisticsRedundantAnchorInfoForPhraseCap.t()
    ]
    | nil,
  scannedAnchorCount: integer() | nil,
  skippedAccumulate: integer() | nil,
  skippedOrReusedReason: String.t() | nil,
  spamLog10Odds: number() | nil,
  timestamp: integer() | nil,
  topPrOffdomainAnchorCount: integer() | nil,
  topPrOndomainAnchorCount: integer() | nil,
  topPrOnsiteAnchorCount: integer() | nil,
  totalDomainPhrasePairsAboveLimit: integer() | nil,
  totalDomainPhrasePairsSeenApprox: integer() | nil,
  totalDomainsAbovePhraseCap: integer() | nil,
  totalDomainsSeen: integer() | nil
}

Functions

decode(value, options)

@spec decode(struct(), keyword()) :: struct()

Unwrap a decoded JSON object into its complex fields.
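
To make "unwrap a decoded JSON object into its complex fields" concrete, here is a rough Python analogue of the idea. It does not use the Elixir client's API. The `AnchorStatistics` and `AnchorSpamInfo` classes are hypothetical stand-ins that cover only a few of the fields in the type above, and the nested model is kept as an opaque wrapper because its fields are not listed on this page.

```python
import json
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class AnchorSpamInfo:
    """Hypothetical stand-in for the nested anchorSpamInfo model."""
    raw: dict


@dataclass
class AnchorStatistics:
    """Small, illustrative subset of the fields in the type above."""
    anchorCount: Optional[int] = None
    penguinPenalty: Optional[float] = None
    anchorSpamInfo: Optional[AnchorSpamInfo] = None
    perdupstats: Optional[list] = None


def decode(value: dict) -> AnchorStatistics:
    """Copy scalar fields as-is and wrap the nested object in its own type.

    Missing keys become None, mirroring the `default: nil` in the field list.
    """
    spam: Any = value.get("anchorSpamInfo")
    return AnchorStatistics(
        anchorCount=value.get("anchorCount"),
        penguinPenalty=value.get("penguinPenalty"),
        anchorSpamInfo=AnchorSpamInfo(raw=spam) if spam is not None else None,
        perdupstats=value.get("perdupstats"),
    )


# Usage: parse the JSON text first, then unwrap the nested field.
stats = decode(json.loads('{"anchorCount": 12, "anchorSpamInfo": {}}'))
```

Parsing and unwrapping stay separate steps here, matching the description: the input to `decode` is an already-decoded JSON object, not raw text.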