  • This module appears to store metadata about download URLs cited for academic papers and articles, particularly PDFs. Its fields describe the content and structure of each document (word count, page count, excerpt), its crawl and indexing state, its accessibility (whether the URL is likely free to read), and its relationship to publishers, venues, and journals.
  • This module could affect how academic papers and articles are ranked and displayed in search engine results pages (SERPs). Fields such as IndexPriority, DisplayPriority, and LikelyWorldViewable suggest that it may favor URLs that are freely accessible or carry higher-quality metadata. This could affect the visibility and discoverability of certain papers, particularly those from lesser-known journals or authors.
  • To fare well under this module, a website should give its academic papers and articles accurate and complete metadata, including content, authors, and publication dates. It should also keep its PDFs crawlable and indexable, and link related papers and journals clearly and consistently. Making content accessible and readable may help as well, since fields such as LikelyWorldViewable, WordCount, and LongChunkCount suggest that these properties are recorded.

GoogleApi.ContentWarehouse.V1.Model.ScienceCitationDownloadURL (google_api_content_warehouse v0.4.0)

Download URL mentioned in citation; we keep up to K of them.


  • DownloadDay (type: integer(), default: nil)
  • LegalMustInclude (type: boolean(), default: nil) - e.g., in law_articles.pat
  • DisplayPriority (type: integer(), default: nil) - display preference score
  • PageCount (type: integer(), default: nil) - Number of pages in the pdf2html conversion output. Only set for PDFs. For a partitioned PDF, this is the page count of the entire volume.
  • LikelyWorldViewable (type: boolean(), default: nil) - Likely to be free-to-read for everyone, after accounting for library links etc.
  • MetadataUrl (type: String.t, default: nil) - url of publisher metadata file
  • NoIndex (type: boolean(), default: nil) - metatag: don't display this url
  • ReferencesInPrevIndex (type: boolean(), default: nil) - were references parsed in a previous index
  • CanonicalUrlfp (type: String.t, default: nil)
  • NoSnippet (type: boolean(), default: nil) - metatag: don't show snippet
  • BrokenLandingPage (type: boolean(), default: nil) - set if we know the landing page is broken
  • DownloadYear (type: integer(), default: nil) - no abbrv
  • WorldViewable (type: boolean(), default: nil) - metatag: is viewable by world
  • UrlAfterRedirects (type: String.t, default: nil)
  • ContentChecksum (type: String.t, default: nil) - checksum of the page
  • ExcerptDebugLabel (type: String.t, default: nil) - label for excerpt (abstract, summary, ..)
  • ContentType (type: integer(), default: nil) - makes gws display nicer :)
  • LongChunkCount (type: integer(), default: nil) - number of long paragraphs
  • MustInclude (type: boolean(), default: nil) - e.g., in science_articles.pat
  • FirstDiscovered (type: String.t, default: nil) - seconds since the epoch
  • IndexPriority (type: integer(), default: nil) - indexing preference score
  • HtmlTitle (type: String.t, default: nil) - html title of the page
  • NoArchive (type: boolean(), default: nil) - metatag: don't show cached version
  • DownloadMonth (type: integer(), default: nil) - DownloadMonth is a zero-indexed field (0 is January).
  • CrawlTimestamp (type: String.t, default: nil) - seconds since the epoch
  • LikelyDifferentMetricsVenue (type: boolean(), default: nil) - In the context of a given venue in Scholar Metrics, whether this URL likely does not link to the current venue.
  • UrlStr (type: String.t, default: nil)
  • HostedStartPage (type: integer(), default: nil)
  • OutLinkCount (type: integer(), default: nil) - number of external URLs (in PDF).
  • LikelyNoCache (type: boolean(), default: nil) - badurls_nocache at indexing time
  • LikelyLegalJournal (type: boolean(), default: nil) - e.g., in legal_journals.pat
  • Type (type: integer(), default: nil) - ArticleType for this particular url
  • MaybeNoIndexReparse (type: boolean(), default: nil) - Incremental only: mark as NoIndexed if this is a reparse and the base version is NoIndexed.
  • LikelyAheadPrint (type: boolean(), default: nil) - Whether this is likely the URL for an ahead print, at indexing time.
  • InPrevIndex (type: boolean(), default: nil) - is url included in a previous index
  • DisplayOrg (type: String.t, default: nil) - publisher display name
  • WordCount (type: integer(), default: nil) - number of words in content/body
  • OceanView (type: GoogleApi.ContentWarehouse.V1.Model.ScienceOceanView.t, default: nil) - describes whether url is viewable in ocean
  • DMCANotice (type: String.t, default: nil) - metatag: URL; result was taken down
  • LikelyNoIndex (type: boolean(), default: nil) - badurls_noreturngws at indexing time
  • ExcerptContent (type: String.t, default: nil) - first few lines of abstract'ish excerpt
  • HostedNumPages (type: integer(), default: nil) - explicit zero means hosting failed
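
As a rough sketch of how the date and timestamp fields above might be consumed: DownloadMonth is zero-indexed, and CrawlTimestamp and FirstDiscovered are strings holding seconds since the epoch. The module DownloadUrlDates and its function names are hypothetical and not part of google_api_content_warehouse. The sketch also assumes DownloadDay is one-indexed, which the field list does not state. Plain maps stand in for the struct so the example runs without the dependency.

```elixir
defmodule DownloadUrlDates do
  # Hypothetical helpers for illustration; not part of google_api_content_warehouse.

  # DownloadMonth is zero-indexed (0 is January), so add 1 before building a Date.
  # Assumption: DownloadDay is one-indexed (the field list does not say).
  def download_date(%{DownloadYear: year, DownloadMonth: month, DownloadDay: day})
      when is_integer(year) and is_integer(month) and is_integer(day) do
    Date.new(year, month + 1, day)
  end

  # Any of the three fields missing or nil.
  def download_date(_), do: {:error, :incomplete_date}

  # CrawlTimestamp and FirstDiscovered arrive as strings of seconds since the epoch.
  def epoch_to_datetime(seconds) when is_binary(seconds) do
    case Integer.parse(seconds) do
      {int, ""} -> DateTime.from_unix(int)
      _ -> {:error, :invalid_timestamp}
    end
  end

  def epoch_to_datetime(nil), do: {:error, :missing_timestamp}
end

url = %{DownloadYear: 2023, DownloadMonth: 0, DownloadDay: 15, CrawlTimestamp: "1700000000"}

IO.inspect(DownloadUrlDates.download_date(url))
IO.inspect(DownloadUrlDates.epoch_to_datetime(Map.get(url, :CrawlTimestamp)))
```

Returning tagged tuples rather than raising keeps the nil defaults of these fields from crashing a caller that processes many URLs.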








@type t() :: %GoogleApi.ContentWarehouse.V1.Model.ScienceCitationDownloadURL{
  BrokenLandingPage: boolean() | nil,
  CanonicalUrlfp: String.t() | nil,
  ContentChecksum: String.t() | nil,
  ContentType: integer() | nil,
  CrawlTimestamp: String.t() | nil,
  DMCANotice: String.t() | nil,
  DisplayOrg: String.t() | nil,
  DisplayPriority: integer() | nil,
  DownloadDay: integer() | nil,
  DownloadMonth: integer() | nil,
  DownloadYear: integer() | nil,
  ExcerptContent: String.t() | nil,
  ExcerptDebugLabel: String.t() | nil,
  FirstDiscovered: String.t() | nil,
  HostedNumPages: integer() | nil,
  HostedStartPage: integer() | nil,
  HtmlTitle: String.t() | nil,
  InPrevIndex: boolean() | nil,
  IndexPriority: integer() | nil,
  LegalMustInclude: boolean() | nil,
  LikelyAheadPrint: boolean() | nil,
  LikelyDifferentMetricsVenue: boolean() | nil,
  LikelyLegalJournal: boolean() | nil,
  LikelyNoCache: boolean() | nil,
  LikelyNoIndex: boolean() | nil,
  LikelyWorldViewable: boolean() | nil,
  LongChunkCount: integer() | nil,
  MaybeNoIndexReparse: boolean() | nil,
  MetadataUrl: String.t() | nil,
  MustInclude: boolean() | nil,
  NoArchive: boolean() | nil,
  NoIndex: boolean() | nil,
  NoSnippet: boolean() | nil,
  OceanView: GoogleApi.ContentWarehouse.V1.Model.ScienceOceanView.t() | nil,
  OutLinkCount: integer() | nil,
  PageCount: integer() | nil,
  ReferencesInPrevIndex: boolean() | nil,
  Type: integer() | nil,
  UrlAfterRedirects: String.t() | nil,
  UrlStr: String.t() | nil,
  WordCount: integer() | nil,
  WorldViewable: boolean() | nil
}
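
Every field in this type may be nil, so a consumer has to tell an unset value apart from an explicit one. HostedNumPages is the clearest case, since an explicit zero means hosting failed. The following is a minimal sketch under that reading; the module HostingStatus and the atoms it returns are hypothetical. It uses Map.get rather than dot access because a capitalized name after a dot is parsed as a module alias, not a field lookup.

```elixir
defmodule HostingStatus do
  # Hypothetical helper for illustration; not part of google_api_content_warehouse.
  # Works on the struct or on a plain map with the same atom keys.
  def classify(url) when is_map(url) do
    case Map.get(url, :HostedNumPages) do
      # Field not set: the source does not say what this implies, so stay neutral.
      nil -> :not_set
      # An explicit zero means hosting failed, per the field description.
      0 -> :hosting_failed
      n when is_integer(n) and n > 0 -> {:hosted, n}
      _ -> :invalid
    end
  end
end

IO.inspect(HostingStatus.classify(%{HostedNumPages: 0}))
IO.inspect(HostingStatus.classify(%{HostedNumPages: nil}))
IO.inspect(HostingStatus.classify(%{HostedNumPages: 12}))
```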



decode(value, options)

@spec decode(struct(), keyword()) :: struct()

Unwrap a decoded JSON object into its complex fields.
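
To illustrate what unwrapping a complex field presumably means here: after JSON parsing, a nested object such as OceanView is still a plain map, and decode replaces it with its model struct. The following is a self-contained sketch of that idea only, not the library's implementation. The Sketch.* modules are hypothetical stand-ins, and because the fields of ScienceOceanView are not listed in this section, the stand-in simply wraps the nested map.

```elixir
defmodule Sketch.OceanView do
  # Hypothetical stand-in for ScienceOceanView; its real fields are not shown here.
  defstruct fields: %{}
end

defmodule Sketch.DownloadURL do
  # Hypothetical stand-in carrying only a few of the fields listed above.
  defstruct [:UrlStr, :NoIndex, :OceanView]
end

defmodule Sketch.Decoder do
  # If the nested field is still a plain map, wrap it in its struct.
  def decode(%Sketch.DownloadURL{OceanView: nested} = url)
      when is_map(nested) and not is_struct(nested) do
    %{url | OceanView: %Sketch.OceanView{fields: nested}}
  end

  # Nested field is nil or already a struct: nothing to unwrap.
  def decode(%Sketch.DownloadURL{} = url), do: url
end

parsed = %Sketch.DownloadURL{
  UrlStr: "https://example.org/paper.pdf",
  OceanView: %{"example_key" => "example_value"}
}

IO.inspect(Sketch.Decoder.decode(parsed))
```

Scalar fields such as UrlStr pass through unchanged; only fields whose type is another model need this step.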