ImageRepositorySpeechRecognitionAlternative

AI Overview

  • The likely purpose of this module is to transcribe spoken audio into text and attach confidence scores to each transcription hypothesis. This could improve search by letting users find audio or video content through spoken keywords.
  • This module could make results more accurate and relevant when searching audio or video content. It could also enable features such as spoken-keyword search, voice-based search filtering, and improved accessibility for users with disabilities.
  • To be more favorable for this function, a website could provide clear, high-quality audio, use descriptive and accurate metadata, and make transcripts of spoken content available. Schema markup highlighting spoken keywords and phrases could also make it easier for search engines to understand and index the content.


GoogleApi.ContentWarehouse.V1.Model.ImageRepositorySpeechRecognitionAlternative (google_api_content_warehouse v0.4.0)

Alternative hypotheses (a.k.a. n-best list).

Attributes

  • confidence (type: number(), default: nil) - The confidence estimate, between 0.0 and 1.0. A higher number indicates a greater estimated likelihood that the recognized words are correct. This field is set only for the top alternative of a non-streaming result, or of a streaming result where is_final=true. It is not guaranteed to be accurate, and callers should not rely on it always being provided. The default of 0.0 is a sentinel value indicating that confidence was not set.
  • transcript (type: String.t, default: nil) - Transcript text representing the words that the user spoke.
  • words (type: list(GoogleApi.ContentWarehouse.V1.Model.ImageRepositoryWordInfo.t), default: nil) - A list of word-specific information for each recognized word. Note: When enable_speaker_diarization is true, you will see all the words from the beginning of the audio.
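To illustrate the attribute shapes above, here is a minimal sketch using a plain map in place of the generated struct (the field values are invented for illustration; a real alternative comes from the speech recognizer):

```elixir
# A plain map standing in for an ImageRepositorySpeechRecognitionAlternative;
# the values are illustrative only.
alternative = %{
  # Estimated likelihood (0.0-1.0) that the recognized words are correct.
  confidence: 0.87,
  # Transcript text representing the words that the user spoke.
  transcript: "turn on the living room lights",
  # Per-word details would normally be a list of ImageRepositoryWordInfo
  # structs; nil mirrors the field's default.
  words: nil
}
```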

Summary

Types

t()

Functions

decode(value, options)

Unwrap a decoded JSON object into its complex fields.

Types


t()

@type t() ::
  %GoogleApi.ContentWarehouse.V1.Model.ImageRepositorySpeechRecognitionAlternative{
    confidence: number() | nil,
    transcript: String.t() | nil,
    words:
      [GoogleApi.ContentWarehouse.V1.Model.ImageRepositoryWordInfo.t()] | nil
  }

Functions


decode(value, options)

@spec decode(struct(), keyword()) :: struct()

Unwrap a decoded JSON object into its complex fields.
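In the generated google_api libraries, decode/2 is normally invoked by the library's JSON deserializer rather than called directly. The plain-Elixir sketch below illustrates the idea conceptually: given a map parsed from JSON, unwrap the complex `words` field into its nested shape. This stand-in is an assumption about the behavior, not the library's implementation:

```elixir
# A map as it might look after raw JSON parsing, before unwrapping.
decoded = %{
  "confidence" => 0.95,
  "transcript" => "hello world",
  "words" => [%{"word" => "hello"}, %{"word" => "world"}]
}

# Conceptual stand-in for decode/2: unwrap the complex `words` field
# into per-word maps (the real function builds ImageRepositoryWordInfo
# structs instead).
unwrap = fn map ->
  %{
    confidence: map["confidence"],
    transcript: map["transcript"],
    words: Enum.map(map["words"] || [], fn w -> %{word: w["word"]} end)
  }
end

alternative = unwrap.(decoded)
```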