> ## Documentation Index
> Fetch the complete documentation index at: https://docs.phidata.com/llms.txt
> Use this file to discover all available pages before exploring further.

# MLX Transcribe

**MLX Transcribe** is a tool for transcribing audio files using MLX Whisper.

## Prerequisites

1. **Install ffmpeg**
   * macOS: `brew install ffmpeg`
   * Ubuntu: `sudo apt-get install ffmpeg`
   * Windows: Download from [https://ffmpeg.org/download.html](https://ffmpeg.org/download.html)

2. **Install mlx-whisper library**
   ```shell theme={null}
   pip install mlx-whisper
   ```

3. **Prepare audio files**
   * Create a 'storage/audio' directory
   * Place your audio files in this directory
   * Supported formats: mp3, mp4, wav, etc.

4. **Download sample audio** (optional)
   * Visit: [https://www.ted.com/talks/reid\_hoffman\_and\_kevin\_scott\_the\_evolution\_of\_ai\_and\_how\_it\_will\_impact\_human\_creativity](https://www.ted.com/talks/reid_hoffman_and_kevin_scott_the_evolution_of_ai_and_how_it_will_impact_human_creativity)
   * Save the audio file to 'storage/audio' directory

## Example

The following agent will use MLX Transcribe to transcribe audio files.

```python cookbook/tools/mlx_transcribe_tools.py theme={null}

from pathlib import Path
from phi.agent import Agent
from phi.model.openai import OpenAIChat
from phi.tools.mlx_transcribe import MLXTranscribe

# Get audio files from storage/audio directory
phidata_root_dir = Path(__file__).parent.parent.parent.resolve()
audio_storage_dir = phidata_root_dir.joinpath("storage/audio")
if not audio_storage_dir.exists():
    audio_storage_dir.mkdir(exist_ok=True, parents=True)

agent = Agent(
    name="Transcription Agent",
    model=OpenAIChat(id="gpt-4o"),
    tools=[MLXTranscribe(base_dir=audio_storage_dir)],
    instructions=[
        "To transcribe an audio file, use the `transcribe` tool with the name of the audio file as the argument.",
        "You can find all available audio files using the `read_files` tool.",
    ],
    markdown=True,
)

agent.print_response("Summarize the reid hoffman ted talk, split into sections", stream=True)

```

## Toolkit Params

| Parameter                         | Type                           | Default                                  | Description                                  |
| --------------------------------- | ------------------------------ | ---------------------------------------- | -------------------------------------------- |
| `base_dir`                        | `Path`                         | `Path.cwd()`                             | Base directory for audio files               |
| `read_files_in_base_dir`          | `bool`                         | `True`                                   | Whether to register the read\_files function |
| `path_or_hf_repo`                 | `str`                          | `"mlx-community/whisper-large-v3-turbo"` | Path or HuggingFace repo for the model       |
| `verbose`                         | `bool`                         | `None`                                   | Enable verbose output                        |
| `temperature`                     | `float` or `Tuple[float, ...]` | `None`                                   | Temperature for sampling                     |
| `compression_ratio_threshold`     | `float`                        | `None`                                   | Compression ratio threshold                  |
| `logprob_threshold`               | `float`                        | `None`                                   | Log probability threshold                    |
| `no_speech_threshold`             | `float`                        | `None`                                   | No speech threshold                          |
| `condition_on_previous_text`      | `bool`                         | `None`                                   | Whether to condition on previous text        |
| `initial_prompt`                  | `str`                          | `None`                                   | Initial prompt for transcription             |
| `word_timestamps`                 | `bool`                         | `None`                                   | Enable word-level timestamps                 |
| `prepend_punctuations`            | `str`                          | `None`                                   | Punctuations to prepend                      |
| `append_punctuations`             | `str`                          | `None`                                   | Punctuations to append                       |
| `clip_timestamps`                 | `str` or `List[float]`         | `None`                                   | Clip timestamps                              |
| `hallucination_silence_threshold` | `float`                        | `None`                                   | Hallucination silence threshold              |
| `decode_options`                  | `dict`                         | `None`                                   | Additional decoding options                  |

## Toolkit Functions

| Function     | Description                                 |
| ------------ | ------------------------------------------- |
| `transcribe` | Transcribes an audio file using MLX Whisper |
| `read_files` | Lists all audio files in the base directory |

## Information

* View on [Github](https://github.com/agno-agi/phidata/blob/main/phi/tools/mlx_transcribe.py)
