
# Audio and tool calling with LFM2.5-Audio-1.5B and LFM2-1.2B-Tool


## What's inside?

This demo combines LFM2.5-Audio-1.5B in TTS and STT modes with LFM2-1.2B-Tool inside a mockup of a car cockpit, letting the user control the car's functions by voice. Everything runs locally in real time.

demo_with_subtitles.mp4

llama.cpp is used for inference of both models, with a custom runner for the audio model. The car cockpit (UI) is vanilla JS+HTML+CSS, and it communicates with the backend through messages over a websocket, like a greatly simplified car CAN bus.
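The README does not specify the message format, so as a rough illustration only (the `type`/`target`/`value` field names and the `encodeCommand`/`handleMessage` helpers are hypothetical, not the repo's actual protocol), a minimal client-side handler for such CAN-bus-style websocket messages might look like:

```javascript
// Hypothetical sketch of a cockpit <-> backend message format.
// Field names ("type", "target", "value") are assumptions for
// illustration, not the repository's actual wire protocol.

// Encode a voice-triggered command as a JSON websocket message.
function encodeCommand(target, value) {
  return JSON.stringify({ type: "command", target, value });
}

// Apply an incoming message to the cockpit state object, ignoring
// messages that are not commands or that target unknown controls.
function handleMessage(raw, cockpit) {
  const msg = JSON.parse(raw);
  if (msg.type === "command" && msg.target in cockpit) {
    cockpit[msg.target] = msg.value; // e.g. set temperature, toggle lights
  }
  return cockpit;
}
```

In a real client these helpers would hang off a `WebSocket` instance (`ws.onmessage`, `ws.send(...)`); the sketch keeps only the encode/dispatch logic so it stands alone.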

## Quick start

> [!NOTE]
> **Supported Platforms**
>
> The following platforms are currently supported:
>
> - macos-arm64
> - ubuntu-arm64
> - ubuntu-x64
> - ubuntu-WSL2

```shell
# Set up the Python env
make setup

# Optional: if llama-server is already in your PATH, symlink it instead of building
# ln -s $(which llama-server) llama-server
# Note: when building for ROCm, also install: sudo apt install -y libstdc++-14-dev

# Prepare the audio and tool calling models
make LFM2.5-Audio-1.5B-GGUF LFM2-1.2B-Tool-GGUF

# Launch the demo
make -j2 audioserver serve
```

> [!NOTE]
> **Building llama-server from source**
>
> The `make -j2 audioserver serve` step builds `llama-server` automatically if it is not already present. This requires `cmake` and a C++ toolchain. If the build fails, install the missing dependencies first:
>
> | Platform | Command |
> | --- | --- |
> | macOS | `brew install cmake` (Xcode CLT required: `xcode-select --install`) |
> | Linux / WSL2 | `make install-deps` |
>
> Then re-run `make -j2 audioserver serve`.