Description of the parameters

Options and features may vary from one service to another. Nevertheless, we try to keep the parameters as uniform as possible across services.

To help you find what you need among these parameters, use the following table. The required parameters are blue.

Note that some parameters might not work with certain languages. Details can be found on the website of the service you use.

Parameter	Type	Description	Amazon	Google	Microsoft	Rev.ai	Speechmatics
`alternative_language_codes`	array(string)	Other languages that might be spoken in the audio. This gives the service a hint for automatically identifying the spoken languages.	Yes	No	No	No	No
`attach_punctuation`	boolean	Merges each punctuation item with the previous word (or with the next word if there is no previous one).	Yes	Yes	Yes	Yes	Yes
`audio_channel_count`	integer	The number of channels in the audio data. The audio data might include a channel for each speaker present on the recording.	Yes	No	See `channels`	Yes	No
`channels`	array(integer)	The audio data might include a channel for each speaker present on the recording. Lists the requested channel numbers. By default, channels 0 and 1 are used.	See `audio_channel_count`	No	Yes	See `audio_channel_count`	No
`content_redaction`	string enum: redacted, redacted_and_unredacted	Enables automatic content redaction. Accepts `redacted` and `redacted_and_unredacted`. Leave this argument empty to disable content redaction. The only redaction type available is PII.	Yes	No	No	No	No
`diarization_speaker_count`	integer ≥ 2	If speaker diarization is enabled, you can provide the number of speakers to improve the diarization.	Yes	No	No	No	No
`enable_automatic_punctuation`	boolean	Enables automatic punctuation.	No	Yes	See `punctuation_mode`	Yes	No
`enable_disfluencies`	boolean	Adds disfluencies (such as "ums" and "uhs") to the result. Support is guaranteed for English only, but it may work with other languages.	No	No	Yes	Yes	No
`enable_speaker_diarization`	boolean	Allows you to identify and separate the different speakers.	Yes	No	Yes	Yes	Yes
`enable_word_time_offsets`	boolean	Provides start and end timestamps for each word.	No	Yes	Yes	No	No
`language_code`	string	The language of the supplied audio as a BCP-47 language tag. (Example: en-US) You can find a list of these tags on this website (https://www.techonthenet.com/js/language_tags.php).	Yes	Yes	Yes	Yes enum: "es", "pt", "fr", "de", "en"	Yes
`max_alternatives`	integer	The number of alternative transcriptions to provide in the response. By default, no alternatives are provided.	Yes	Yes	No	No	No
`media_format`	string enum: mp3, mp4, wav, flac, ogg, amr, webm	The format of your audio file. Each service restricts which formats it accepts.	Yes	No	No	No	No
`phrases`	array(string)	A list of words and phrases that provide hints for the speech recognition.	No	Yes	No	Yes	Yes
`profanity_filter`	boolean	Indicates whether to filter out profane words or phrases.	No	Yes	See `profanity_filter_mode`	Yes	No
`profanity_filter_mode`	string enum: None, Masked, Removed, Tags	Indicates whether to filter out profane words or phrases. Can be None (deactivate filtering), Masked (replace word with asterisks), Removed (remove word) or Tags (put tags around the word).	No	See `profanity_filter`	Yes	See `profanity_filter`	No
`punctuation_mode`	string enum: None, Dictated, Automatic, DictatedAndAutomatic	Indicates the punctuation mode to use. Can be None (deactivate punctuation), Dictated (indicate an explicit punctuation), Automatic (allow decoder to process punctuation) or DictatedAndAutomatic (use dictated and automatic punctuation).	No	See `enable_automatic_punctuation`	Yes	See `enable_automatic_punctuation`	No
`sample_rate_hertz`	integer	The sample rate (in Hertz) of the supplied audio.	Yes	Yes	No	No	No
`use_enhanced_model`	boolean	Uses an enhanced model. Enhanced models might give better results but have a higher cost.	No	Yes	No	No	Yes
`vocabulary`	object { phrases }	Contains additional contextual information for processing this audio. This argument contains `phrases`, a list of words and phrases that provide hints for the speech recognition.	No	Yes	No	Yes	Yes
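
Because support differs per service, it can help to check a set of parameters against the table before sending a request. The following is a minimal sketch of such a check, not part of any service's API: the `SUPPORT` dictionary and the `unsupported_parameters` function are hypothetical names, and the dictionary copies only a few rows of the table, keeping the services marked "Yes" (entries marked "See ..." are treated as unsupported).

```python
# Hypothetical helper: a partial support matrix copied from the table above.
# Only "Yes" cells are listed; "See ..." cells are treated as unsupported.
ALL_SERVICES = {"Amazon", "Google", "Microsoft", "Rev.ai", "Speechmatics"}

SUPPORT = {
    "alternative_language_codes": {"Amazon"},
    "attach_punctuation": set(ALL_SERVICES),
    "enable_automatic_punctuation": {"Google", "Rev.ai"},
    "enable_speaker_diarization": {"Amazon", "Microsoft", "Rev.ai", "Speechmatics"},
    "enable_word_time_offsets": {"Google", "Microsoft"},
    "language_code": set(ALL_SERVICES),
    "punctuation_mode": {"Microsoft"},
    "sample_rate_hertz": {"Amazon", "Google"},
}


def unsupported_parameters(service, params):
    """Return the sorted names in `params` that are not marked "Yes" for `service`.

    Names missing from the partial SUPPORT matrix are reported as well,
    since this sketch cannot confirm that they are supported.
    """
    return sorted(
        name for name in params if service not in SUPPORT.get(name, set())
    )


params = {
    "language_code": "en-US",
    "enable_automatic_punctuation": True,
    "enable_word_time_offsets": True,
}

print(unsupported_parameters("Google", params))  # → []
print(unsupported_parameters("Amazon", params))
# → ['enable_automatic_punctuation', 'enable_word_time_offsets']
```

A fuller version would cover every row of the table and could map "See ..." cells to the equivalent parameter (for example, suggesting `punctuation_mode` instead of `enable_automatic_punctuation` for Microsoft).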