ImageRepositoryWordInfo

AI Overview

  • The potential purpose of this module is to provide detailed information about recognized words in audio or video content: the confidence of the recognition, the start and end times of the word, the speaker who said it, and the word itself. This information can improve the searchability and accessibility of multimedia content.
  • This module could impact search results by helping Google better understand the content of audio and video files and return more accurate, relevant results for users searching for specific words or phrases. For example, if a user searches for a specific quote from a video, this module could help Google identify the exact timestamp of the quote and return a more precise result.
  • A website can make its content more favorable for this module by providing high-quality, accurately transcribed audio and video, and by using schema markup to supply additional information such as speaker names and timestamps. Using clear, concise language and providing closed captions or subtitles for audio and video content also helps Google's algorithms understand the content and return more accurate results.

GoogleApi.ContentWarehouse.V1.Model.ImageRepositoryWordInfo (google_api_content_warehouse v0.4.0)

Word-specific information for recognized words.

Attributes

  • confidence (type: number(), default: nil) - The confidence estimate, between 0.0 and 1.0. A higher number indicates a greater estimated likelihood that the recognized words are correct. This field is set only for the top alternative of a non-streaming result, or of a streaming result where is_final=true. It is not guaranteed to be accurate, and users should not rely on it always being provided. The default of 0.0 is a sentinel value indicating that confidence was not set.
  • endTime (type: String.t, default: nil) - Time offset, relative to the beginning of the audio, corresponding to the end of the spoken word. This field is only set if enable_word_time_offsets=true, and only in the top hypothesis. This is an experimental feature, and the accuracy of the time offset can vary.
  • speakerTag (type: integer(), default: nil) - A distinct integer value assigned to every speaker within the audio. This field specifies which of those speakers was detected to have spoken this word. Values range from 1 to diarization_speaker_count. speaker_tag is set only if enable_speaker_diarization=true, and only in the top alternative.
  • startTime (type: String.t, default: nil) - Time offset, relative to the beginning of the audio, corresponding to the start of the spoken word. This field is only set if enable_word_time_offsets=true, and only in the top hypothesis. This is an experimental feature, and the accuracy of the time offset can vary.
  • word (type: String.t, default: nil) - The word corresponding to this set of information.
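As a concrete illustration of how these attributes fit together, a single recognized word from a diarized transcript might be represented as below. All field values here are invented for the example, not taken from real API output:

```elixir
# Hypothetical example: one recognized word with timing, confidence,
# and speaker information. Values are illustrative only.
word_info = %GoogleApi.ContentWarehouse.V1.Model.ImageRepositoryWordInfo{
  word: "hello",
  confidence: 0.92,     # high estimated likelihood the recognition is correct
  startTime: "1.200s",  # offset from the beginning of the audio
  endTime: "1.450s",
  speakerTag: 1         # first speaker detected by diarization
}
```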

Summary

Types

t()

Functions

decode(value, options)

Unwrap a decoded JSON object into its complex fields.

Types

t()

@type t() :: %GoogleApi.ContentWarehouse.V1.Model.ImageRepositoryWordInfo{
  confidence: number() | nil,
  endTime: String.t() | nil,
  speakerTag: integer() | nil,
  startTime: String.t() | nil,
  word: String.t() | nil
}

Functions

decode(value, options)

@spec decode(struct(), keyword()) :: struct()

Unwrap a decoded JSON object into its complex fields.
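A minimal sketch of getting a JSON payload into this struct, assuming the Poison library is available (the `as:` pattern shown is how generated Google API Elixir clients commonly decode models; the JSON fragment is invented for illustration):

```elixir
# Sketch only: decode a hypothetical JSON fragment into the model struct.
json = ~s({"word": "hello", "confidence": 0.92, "speakerTag": 1})

{:ok, word_info} =
  Poison.decode(json,
    as: %GoogleApi.ContentWarehouse.V1.Model.ImageRepositoryWordInfo{}
  )

# Fields absent from the JSON (startTime, endTime) remain nil.
word_info.word
```

Since every field of this model is a simple scalar, `decode/2` has no nested model fields to unwrap; it matters more for models that embed other structs.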