MLX Transcribe
MLX Transcribe is a tool for transcribing audio files using MLX Whisper.
Prerequisites
- Install ffmpeg
  - macOS: brew install ffmpeg
  - Ubuntu: sudo apt-get install ffmpeg
  - Windows: Download from https://ffmpeg.org/download.html
- Install mlx-whisper library
  - pip install mlx-whisper
- Prepare audio files
  - Create a ‘storage/audio’ directory
  - Place your audio files in this directory
  - Supported formats: mp3, mp4, wav, etc.
- Download sample audio (optional)
  - Visit: https://www.ted.com/talks/reid_hoffman_and_kevin_scott_the_evolution_of_ai_and_how_it_will_impact_human_creativity
  - Save the audio file to the ‘storage/audio’ directory
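The setup steps above can be verified with a small script. This is a minimal sketch using only the Python standard library; the helper name `check_prerequisites` is hypothetical and is not part of the toolkit.

```python
import shutil
from pathlib import Path


def check_prerequisites(audio_dir: Path = Path("storage/audio")) -> dict:
    """Create the audio directory and report whether ffmpeg is on PATH.

    Hypothetical helper for illustration; not part of MLX Transcribe.
    """
    # Create 'storage/audio' (and any missing parents) if it does not exist yet.
    audio_dir.mkdir(parents=True, exist_ok=True)
    return {
        # shutil.which returns None when the executable cannot be found.
        "ffmpeg_found": shutil.which("ffmpeg") is not None,
        "audio_dir": audio_dir.resolve(),
    }


if __name__ == "__main__":
    report = check_prerequisites()
    print(f"ffmpeg found: {report['ffmpeg_found']}")
    print(f"audio directory: {report['audio_dir']}")
```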
Example
The following agent will use MLX Transcribe to transcribe audio files.
cookbook/tools/mlx_transcribe_tools.py
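As a rough illustration of the transcription step such an agent relies on, the sketch below calls the `transcribe` function of the mlx-whisper library directly. It is not the cookbook file. The helper name `transcribe_audio` and its `transcribe_fn` parameter are assumptions added for illustration, so that the function can be exercised without MLX installed.

```python
from pathlib import Path
from typing import Callable, Optional


def transcribe_audio(
    file_name: str,
    base_dir: Path = Path("storage/audio"),
    path_or_hf_repo: str = "mlx-community/whisper-large-v3-turbo",
    transcribe_fn: Optional[Callable[..., dict]] = None,
) -> str:
    """Transcribe one audio file from base_dir and return its text.

    Hypothetical helper for illustration; not the toolkit's implementation.
    """
    audio_path = base_dir / file_name
    if not audio_path.is_file():
        raise FileNotFoundError(f"No audio file at {audio_path}")

    if transcribe_fn is None:
        # Imported lazily so this module loads on machines without MLX.
        import mlx_whisper

        transcribe_fn = mlx_whisper.transcribe

    # The result is a dict; the full transcript is stored under "text".
    result = transcribe_fn(str(audio_path), path_or_hf_repo=path_or_hf_repo)
    return result.get("text", "").strip()
```

With mlx-whisper installed, `transcribe_audio("talk.mp3")` would transcribe a file named talk.mp3 (a placeholder name) from the ‘storage/audio’ directory.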
Toolkit Params
Parameter | Type | Default | Description |
---|---|---|---|
base_dir | Path | Path.cwd() | Base directory for audio files |
read_files_in_base_dir | bool | True | Whether to register the read_files function |
path_or_hf_repo | str | "mlx-community/whisper-large-v3-turbo" | Path or HuggingFace repo for the model |
verbose | bool | None | Enable verbose output |
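temperature | float or Tuple[float, ...] | None | Sampling temperature; a tuple gives fallback temperatures tried in order when decoding fails the thresholds below |
compression_ratio_threshold | float | None | Decoding is treated as failed if the gzip compression ratio of the output exceeds this value |
logprob_threshold | float | None | Decoding is treated as failed if the average log probability of the sampled tokens falls below this value |
no_speech_threshold | float | None | A segment is treated as silent if its no-speech probability exceeds this value and its average log probability is below logprob_threshold |
condition_on_previous_text | bool | None | Whether to use the previous output as a prompt for the next window |
initial_prompt | str | None | Optional text used as a prompt for the first window |
word_timestamps | bool | None | Enable word-level timestamps |
prepend_punctuations | str | None | With word_timestamps enabled, punctuation symbols to merge with the following word |
append_punctuations | str | None | With word_timestamps enabled, punctuation symbols to merge with the preceding word |
clip_timestamps | str or List[float] | None | Start and end timestamps (in seconds) of the clips to process |
hallucination_silence_threshold | float | None | With word_timestamps enabled, silent periods longer than this many seconds are skipped when a possible hallucination is detected |
decode_options | dict | None | Additional decoding options |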
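Most parameters in the table default to None. One plausible way to handle this is to forward only the values that were explicitly set, so that MLX Whisper applies its own defaults for the rest. The sketch below illustrates that idea; the helper name `build_transcribe_options` is hypothetical, and merging `decode_options` into the keyword arguments is an assumption, not a confirmed detail of the toolkit.

```python
from typing import Any, Dict


def build_transcribe_options(**params: Any) -> Dict[str, Any]:
    """Keep only parameters that were explicitly set (not None).

    Hypothetical helper for illustration; not the toolkit's implementation.
    """
    # Assumption: extra decoding options are passed as plain keyword arguments.
    decode_options = params.pop("decode_options", None) or {}
    options = {name: value for name, value in params.items() if value is not None}
    options.update(decode_options)
    return options


options = build_transcribe_options(
    word_timestamps=True,
    temperature=None,  # left unset, so it is dropped
    decode_options={"language": "en"},
)
print(options)  # {'word_timestamps': True, 'language': 'en'}
```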
Toolkit Functions
Function | Description |
---|---|
transcribe | Transcribes an audio file using MLX Whisper |
read_files | Lists all audio files in the base directory |
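The behaviour described for read_files can be approximated with the standard library. This is a minimal sketch under stated assumptions: the helper name `list_audio_files` and the `AUDIO_SUFFIXES` set are hypothetical, and the toolkit's own function may not filter by extension at all.

```python
from pathlib import Path
from typing import List

# Assumed subset of the supported formats listed in the prerequisites.
AUDIO_SUFFIXES = {".mp3", ".mp4", ".wav"}


def list_audio_files(base_dir: Path) -> List[str]:
    """Return the sorted names of audio files directly inside base_dir.

    Hypothetical helper for illustration; not the toolkit's implementation.
    """
    return sorted(
        path.name
        for path in base_dir.iterdir()
        if path.is_file() and path.suffix.lower() in AUDIO_SUFFIXES
    )
```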