
# Audio and tool calling with LFM2.5-Audio-1.5B and LFM2-1.2B-Tool


## What's inside?

This demo combines LFM2.5-Audio-1.5B in TTS and STT modes with LFM2-1.2B-Tool inside a mockup of a car cockpit, letting the user control the car's functions by voice. Everything runs locally in real time.

demo_with_subtitles.mp4

llama.cpp is used for inference of both models, with a custom runner for the audio model. The car cockpit (UI) is vanilla JS+HTML+CSS, and it communicates with the backend through messages over a websocket, like a greatly simplified car CAN bus.
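The README does not specify the message format, so as a rough illustration only (the `type`/`target`/`value` field names and the `encodeCommand`/`handleMessage` helpers are hypothetical, not the repo's actual protocol), a minimal client-side handler for such CAN-bus-style websocket messages might look like:

```javascript
// Hypothetical sketch of a cockpit <-> backend message format.
// Field names ("type", "target", "value") are assumptions for
// illustration, not the repository's actual wire protocol.

// Encode a voice-triggered command as a JSON websocket message.
function encodeCommand(target, value) {
  return JSON.stringify({ type: "command", target, value });
}

// Apply an incoming message to the cockpit state object, ignoring
// messages that are not commands or that target unknown controls.
function handleMessage(raw, cockpit) {
  const msg = JSON.parse(raw);
  if (msg.type === "command" && msg.target in cockpit) {
    cockpit[msg.target] = msg.value; // e.g. set temperature, toggle lights
  }
  return cockpit;
}
```

In a real client these helpers would hang off a `WebSocket` instance (`ws.onmessage`, `ws.send(...)`); the sketch keeps only the encode/dispatch logic so it stands alone.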

## Quick start

> [!NOTE]
> **Supported Platforms**
>
> The following platforms are currently supported:
>
> - macos-arm64
> - ubuntu-arm64
> - ubuntu-x64
> - ubuntu-WSL2

```shell
# Set up the Python env
make setup

# Optional: if llama-server is already in your PATH, symlink it instead of building
# ln -s $(which llama-server) llama-server
# Note: when building for ROCm, also install: sudo apt install -y libstdc++-14-dev

# Prepare the audio and tool calling models
make LFM2.5-Audio-1.5B-GGUF LFM2-1.2B-Tool-GGUF

# Launch the demo
make -j2 audioserver serve
```

> [!NOTE]
> **Building llama-server from source**
>
> The `make -j2 audioserver serve` step builds `llama-server` automatically if it is not already present. This requires `cmake` and a C++ toolchain. If the build fails, install the missing dependencies first:
>
> | Platform | Command |
> | --- | --- |
> | macOS | `brew install cmake` (Xcode CLT required: `xcode-select --install`) |
> | Linux / WSL2 | `make install-deps` |
>
> Then re-run `make -j2 audioserver serve`.