diff --git a/README.md b/README.md
index 40bd89c95..d9a21ee43 100644
--- a/README.md
+++ b/README.md
@@ -22,7 +22,7 @@ Battle tested at [
 Apple Intelligence & Siri
+
+> Apple Intelligence is not available on Intel Macs or older macOS versions. RubyLLM will raise an error if the requirements aren't met.
+{: .note }
+
+## Quick Start
+
+No configuration needed. Just use it:
+
+```ruby
+chat = RubyLLM.chat(model: "apple-intelligence", provider: :apple_intelligence)
+chat.ask "Explain Ruby's block syntax"
+```
+
+That's it. No API keys, no environment variables, no account setup. The `osx-ai-inloop` binary is automatically downloaded and cached on first use.
+
+## Conversation History
+
+Apple Intelligence supports multi-turn conversations, just like any other provider:
+
+```ruby
+chat = RubyLLM.chat(model: "apple-intelligence", provider: :apple_intelligence)
+chat.ask "What is a Ruby module?"
+chat.ask "How is that different from a class?"
+chat.ask "When should I use one over the other?"
+```
+
+Each follow-up includes the full conversation history, so the model maintains context across turns.
+
+## Configuration
+
+### Zero Config (Default)
+
+Apple Intelligence works out of the box with no configuration. RubyLLM automatically downloads the `osx-ai-inloop` binary to `~/.ruby_llm/bin/osx-ai-inloop` on first use.
+
+### Custom Binary Path
+
+If you prefer to manage the binary location yourself:
+
+```ruby
+RubyLLM.configure do |config|
+  config.apple_intelligence_binary_path = "/opt/bin/osx-ai-inloop"
+end
+```
+
+### Setting as Default Model
+
+To use Apple Intelligence as your default chat model:
+
+```ruby
+RubyLLM.configure do |config|
+  config.default_model = "apple-intelligence"
+end
+
+# Now RubyLLM.chat uses Apple Intelligence automatically
+chat = RubyLLM.chat(provider: :apple_intelligence)
+chat.ask "Hello!"
+```
+
+## How It Works
+
+1. RubyLLM formats your conversation as a JSON payload
+2. The payload is piped to the `osx-ai-inloop` binary via stdin
+3. The binary communicates with Apple's Foundation Models on-device
+4. The response is read from stdout and parsed back into RubyLLM's standard format
+
+The binary is sourced from the [osx-ai-inloop](https://github.com/inloopstudio-team/apple-intelligence-inloop) project and cached at `~/.ruby_llm/bin/osx-ai-inloop`.
+
+## Limitations
+
+Apple Intelligence is text-only and runs entirely on-device. This means:
+
+* **No streaming** — responses are returned all at once
+* **No vision** — image analysis is not supported
+* **Limited tool calling** — only a simple, single-tool heuristic is attempted; full function calling is not supported
+* **No embeddings** — use another provider for `RubyLLM.embed`
+* **No image generation** — use another provider for `RubyLLM.paint`
+* **macOS only** — requires Apple Silicon and macOS 26+
+
+For capabilities that Apple Intelligence doesn't support, you can use another provider alongside it:
+
+```ruby
+# Local AI for chat
+local_chat = RubyLLM.chat(model: "apple-intelligence", provider: :apple_intelligence)
+local_chat.ask "Summarize this concept"
+
+# Cloud provider for embeddings
+RubyLLM.embed "Ruby is elegant and expressive"
+```
+
+## Troubleshooting
+
+### "Platform not supported" error
+
+Apple Intelligence requires macOS 26+ on Apple Silicon.
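You can also probe the requirement from Ruby. The snippet below is a rough sketch mirroring the provider's `BinaryManager.check_platform!` guard (macOS is required; a non-arm64 architecture only triggers a warning); the helper name here is made up for illustration:

```ruby
# Hypothetical helper sketching the provider's platform guard:
# darwin is mandatory, arm64 is the architecture the binary targets.
def apple_intelligence_platform
  return :unsupported unless RUBY_PLATFORM.include?('darwin')
  return :untested_arch unless RUBY_PLATFORM.include?('arm64')

  :ok
end

puts apple_intelligence_platform
```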
+Verify your setup:
+
+* Check macOS version: Apple menu > About This Mac
+* Ensure Apple Intelligence is enabled: System Settings > Apple Intelligence & Siri
+
+### Binary download fails
+
+If the automatic download fails (network issues, firewall, etc.), download manually:
+
+```bash
+mkdir -p ~/.ruby_llm/bin
+wget -O ~/.ruby_llm/bin/osx-ai-inloop \
+  https://github.com/inloopstudio-team/apple-intelligence-inloop/raw/refs/heads/main/bin/osx-ai-inloop-arm64
+chmod +x ~/.ruby_llm/bin/osx-ai-inloop
+```
+
+### Binary not found at custom path
+
+If you configured a custom binary path, ensure the file exists and is executable:
+
+```bash
+ls -la /your/custom/path/osx-ai-inloop
+chmod +x /your/custom/path/osx-ai-inloop
+```
+
+## Next Steps
+
+Now that you have local AI running, explore other RubyLLM features:
+
+- [Chat with AI models]({% link _core_features/chat.md %}) for more conversation features
+- [Configuration]({% link _getting_started/configuration.md %}) for multi-provider setups
+- [Tools and function calling]({% link _core_features/tools.md %}) with cloud providers
diff --git a/docs/_getting_started/configuration.md b/docs/_getting_started/configuration.md
index b5686cb2e..978635f01 100644
--- a/docs/_getting_started/configuration.md
+++ b/docs/_getting_started/configuration.md
@@ -107,6 +107,26 @@ end
 > Attempting to use an unconfigured provider will raise `RubyLLM::ConfigurationError`. Only configure what you need.
 {: .note }
+
+### Apple Intelligence (On-Device)
+
+Apple Intelligence requires no API keys — it runs entirely on your Mac. Just use it:
+
+```ruby
+chat = RubyLLM.chat(model: "apple-intelligence", provider: :apple_intelligence)
+chat.ask "Hello from on-device AI!"
+```
+
+The `osx-ai-inloop` binary is automatically downloaded on first use.
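If you want to confirm whether the binary is already cached, you can check the default location directly. A small sketch (the helper name is illustrative; the path matches the documented cache directory):

```ruby
# Default cache location used when no custom path is configured
# (~/.ruby_llm/bin/osx-ai-inloop, per the documentation above).
def default_apple_binary_path
  File.join(Dir.home, '.ruby_llm', 'bin', 'osx-ai-inloop')
end

path = default_apple_binary_path
if File.executable?(path)
  puts "Cached binary found at #{path}"
else
  puts 'No cached binary yet; it will be downloaded on first use'
end
```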
+To customize its location:
+
+```ruby
+RubyLLM.configure do |config|
+  config.apple_intelligence_binary_path = "/opt/bin/osx-ai-inloop"
+end
+```
+
+> Apple Intelligence requires macOS 26+ (Tahoe) on Apple Silicon with Apple Intelligence enabled. See the [Apple Intelligence guide]({% link _getting_started/apple-intelligence.md %}) for full details.
+{: .note }
+
 ### OpenAI Organization & Project Headers

 For OpenAI users with multiple organizations or projects:
@@ -450,6 +470,9 @@ Here's a complete reference of all configuration options:
 ```ruby
 RubyLLM.configure do |config|
+  # Apple Intelligence (on-device, no API key needed)
+  config.apple_intelligence_binary_path = String # Optional: custom binary path
+
   # Anthropic
   config.anthropic_api_key = String
   config.anthropic_api_base = String # v1.13.0+
diff --git a/docs/_getting_started/overview.md b/docs/_getting_started/overview.md
index 2c7cbe747..69c4e036c 100644
--- a/docs/_getting_started/overview.md
+++ b/docs/_getting_started/overview.md
@@ -149,6 +149,9 @@ chat = RubyLLM.chat(
   model: "{{ site.models.local_llama }}",
   provider: :ollama,
 )
+
+# On-device AI with Apple Intelligence — no API keys, no cloud
+chat = RubyLLM.chat(model: "apple-intelligence", provider: :apple_intelligence)
 ```

 ### Capability Management
diff --git a/lib/ruby_llm.rb b/lib/ruby_llm.rb
index 87bc94c9d..1332b286a 100644
--- a/lib/ruby_llm.rb
+++ b/lib/ruby_llm.rb
@@ -15,6 +15,7 @@ loader = Zeitwerk::Loader.for_gem
 loader.inflector.inflect(
+  'apple_intelligence' => 'AppleIntelligence',
   'azure' => 'Azure',
   'UI' => 'UI',
   'api' => 'API',
@@ -93,6 +94,7 @@ def logger
   end
 end

+RubyLLM::Provider.register :apple_intelligence, RubyLLM::Providers::AppleIntelligence
 RubyLLM::Provider.register :anthropic, RubyLLM::Providers::Anthropic
 RubyLLM::Provider.register :azure, RubyLLM::Providers::Azure
 RubyLLM::Provider.register :bedrock, RubyLLM::Providers::Bedrock
diff --git a/lib/ruby_llm/providers/apple_intelligence.rb b/lib/ruby_llm/providers/apple_intelligence.rb
new file mode 100644
index 000000000..8f0f9e2d6
--- /dev/null
+++ b/lib/ruby_llm/providers/apple_intelligence.rb
@@ -0,0 +1,83 @@
+# frozen_string_literal: true
+
+module RubyLLM
+  module Providers
+    # Apple Intelligence provider — pipes requests through the osx-ai-inloop
+    # binary via stdin/stdout, completely bypassing HTTP/Faraday.
+    class AppleIntelligence < Provider
+      include AppleIntelligence::Chat
+      include AppleIntelligence::Models
+
+      def initialize(config)
+        super
+        @config = config
+        @connection = nil
+      end
+
+      def api_base
+        nil
+      end
+
+      # rubocop:disable Metrics/ParameterLists,Metrics/PerceivedComplexity
+      def complete(messages, tools: nil, temperature: nil, model: nil, params: {}, headers: {}, schema: nil,
+                   thinking: nil, tool_prefs: nil, &)
+        _ = [temperature, model, params, headers, schema, thinking, tool_prefs] # not used for local provider
+
+        # Two-pass tool calling: if tools are registered and we haven't already
+        # executed a tool (no :tool messages yet), extract arguments and call.
+        if tools&.any? && messages.none? { |m| m.role == :tool }
+          last_user = messages.reverse.find { |m| m.role == :user }
+          tool_msg = try_tool_call(tools, last_user, @config) if last_user
+          return tool_msg if tool_msg
+        end
+
+        payload = build_payload(messages)
+        execute_binary(payload, @config)
+      end
+      # rubocop:enable Metrics/ParameterLists,Metrics/PerceivedComplexity
+
+      class << self
+        def configuration_options
+          %i[apple_intelligence_binary_path]
+        end
+
+        def configuration_requirements
+          []
+        end
+
+        def local?
+          true
+        end
+
+        def assume_models_exist?
+          true
+        end
+
+        def capabilities
+          AppleIntelligence::Capabilities
+        end
+      end
+
+      private
+
+      def try_tool_call(tools, last_user, config)
+        user_text = case last_user.content
+                    when String then last_user.content
+                    when Content then last_user.content.text || ''
+                    else last_user.content.to_s
+                    end
+        tool_result = resolve_tool_call(tools, user_text, config)
+        return unless tool_result
+
+        Message.new(
+          role: :assistant,
+          content: '',
+          tool_calls: tool_result,
+          model_id: 'apple-intelligence',
+          input_tokens: 0,
+          output_tokens: 0
+        )
+      end
+    end
+  end
+end
diff --git a/lib/ruby_llm/providers/apple_intelligence/binary_manager.rb b/lib/ruby_llm/providers/apple_intelligence/binary_manager.rb
new file mode 100644
index 000000000..94a2fbca0
--- /dev/null
+++ b/lib/ruby_llm/providers/apple_intelligence/binary_manager.rb
@@ -0,0 +1,56 @@
+# frozen_string_literal: true
+
+require 'open-uri'
+require 'fileutils'
+
+module RubyLLM
+  module Providers
+    class AppleIntelligence
+      # Manages downloading, caching, and locating the osx-ai-inloop binary
+      module BinaryManager
+        BINARY_URL = 'https://github.com/inloopstudio-team/apple-intelligence-inloop/raw/refs/heads/main/bin/osx-ai-inloop-arm64'
+        DEFAULT_CACHE_DIR = File.join(Dir.home, '.ruby_llm', 'bin')
+        DEFAULT_BINARY_NAME = 'osx-ai-inloop'
+
+        module_function
+
+        def binary_path(config = nil)
+          custom = config&.apple_intelligence_binary_path
+          return custom if custom && File.executable?(custom)
+
+          default_path = File.join(DEFAULT_CACHE_DIR, DEFAULT_BINARY_NAME)
+          ensure_binary!(default_path) unless File.executable?(default_path)
+          default_path
+        end
+
+        def ensure_binary!(path)
+          check_platform!
+          download_binary!(path)
+          File.chmod(0o755, path)
+        end
+
+        def check_platform!
+          raise RubyLLM::Error, 'Apple Intelligence provider requires macOS' unless RUBY_PLATFORM.include?('darwin')
+
+          return if RUBY_PLATFORM.include?('arm64')
+
+          RubyLLM.logger.warn('Apple Intelligence binary is built for arm64. ' \
+                              'It may not work on this architecture.')
+        end
+
+        def download_binary!(path)
+          FileUtils.mkdir_p(File.dirname(path))
+          RubyLLM.logger.info("Downloading osx-ai-inloop binary to #{path}...")
+
+          URI.open(BINARY_URL, 'rb') do |remote| # rubocop:disable Security/Open
+            File.binwrite(path, remote.read)
+          end
+
+          RubyLLM.logger.info('Binary downloaded successfully.')
+        rescue OpenURI::HTTPError, SocketError, Errno::ECONNREFUSED => e
+          raise RubyLLM::Error, "Failed to download Apple Intelligence binary: #{e.message}"
+        end
+      end
+    end
+  end
+end
diff --git a/lib/ruby_llm/providers/apple_intelligence/capabilities.rb b/lib/ruby_llm/providers/apple_intelligence/capabilities.rb
new file mode 100644
index 000000000..00aae98ac
--- /dev/null
+++ b/lib/ruby_llm/providers/apple_intelligence/capabilities.rb
@@ -0,0 +1,20 @@
+# frozen_string_literal: true
+
+module RubyLLM
+  module Providers
+    class AppleIntelligence
+      # Capability declarations for Apple Intelligence on-device models
+      module Capabilities
+        module_function
+
+        def supports_tool_choice?(_model_id)
+          false
+        end
+
+        def supports_tool_parallel_control?(_model_id)
+          false
+        end
+      end
+    end
+  end
+end
diff --git a/lib/ruby_llm/providers/apple_intelligence/chat.rb b/lib/ruby_llm/providers/apple_intelligence/chat.rb
new file mode 100644
index 000000000..f0d5f1e3a
--- /dev/null
+++ b/lib/ruby_llm/providers/apple_intelligence/chat.rb
@@ -0,0 +1,219 @@
+# frozen_string_literal: true
+
+require 'open3'
+require 'json'
+require 'securerandom'
+
+module RubyLLM
+  module Providers
+    class AppleIntelligence
+      # Chat completion via the osx-ai-inloop binary pipe
+      module Chat
+        EXIT_CODE_ERRORS = {
+          1 => 'Invalid arguments',
+          2 => 'Unsupported environment',
+          3 => 'Unavailable model',
+          4 => 'Generation failure',
+          5 => 'Internal error'
+        }.freeze
+
+        private
+
+        def build_payload(messages) # rubocop:disable Metrics/PerceivedComplexity
+          system_prompt = nil
+          conversation = []
+
+          messages.each do |msg|
+            case msg.role
+            when :system
+              system_prompt = extract_text(msg.content)
+            when :user, :assistant, :tool
+              conversation << msg
+            end
+          end
+
+          latest_user_message = extract_text(conversation.pop.content) if conversation.last&.role == :user
+
+          # After tool execution the last message is :tool (the result).
+          # Synthesize a prompt so the model can answer using the tool output.
+          if latest_user_message.nil? || latest_user_message.empty?
+            tool_results = conversation.select { |m| m.role == :tool }.map { |m| extract_text(m.content) }
+            user_msg = conversation.reverse.find { |m| m.role == :user }
+            original_question = user_msg ? extract_text(user_msg.content) : 'the user question'
+
+            latest_user_message = "Answer this question: #{original_question}\n\n" \
+                                  "Use this data: #{tool_results.join('; ')}"
+            conversation = []
+          end
+
+          input_parts = conversation.map { |msg| format_conversation_message(msg) }
+
+          payload = {
+            prompt: latest_user_message,
+            model: 'on-device',
+            format: 'json',
+            stream: false
+          }
+          payload[:system] = system_prompt if system_prompt
+          payload[:input] = input_parts.join("\n") unless input_parts.empty?
+          payload
+        end
+
+        def format_conversation_message(msg)
+          if msg.role == :tool
+            tool_name = msg.tool_call_id || 'unknown'
+            "tool_result (#{tool_name}): #{extract_text(msg.content)}"
+          else
+            "#{msg.role}: #{extract_text(msg.content)}"
+          end
+        end
+
+        def extract_text(content)
+          case content
+          when String then content
+          when Content then content.text || content.to_s
+          else content.to_s
+          end
+        end
+
+        def execute_binary(payload, config)
+          bin = BinaryManager.binary_path(config)
+          json_input = JSON.generate(payload)
+
+          stdout, stderr, status = Open3.capture3(bin, stdin_data: json_input)
+
+          handle_exit_code(status, stdout, stderr)
+          parse_binary_response(stdout)
+        end
+
+        def resolve_tool_call(tools, user_message, config) # rubocop:disable Metrics/PerceivedComplexity
+          return nil unless tools&.any?
+
+          tool_name, tool = tools.first
+
+          # Zero-parameter tools: call immediately
+          if tool.parameters.empty?
+            call_id = "call_#{SecureRandom.hex(8)}"
+            return { call_id => ToolCall.new(id: call_id, name: tool_name.to_s, arguments: {}) }
+          end
+
+          extract_tool_arguments(tool_name, tool, user_message, config)
+        rescue StandardError => e
+          RubyLLM.logger.debug { "Tool call resolution failed: #{e.message}" }
+          nil
+        end
+
+        def extract_tool_arguments(tool_name, tool, user_message, config)
+          arguments = {}
+          bin = BinaryManager.binary_path(config)
+
+          tool.parameters.each_value do |param|
+            value = extract_single_param(bin, param.name, user_message)
+            arguments[param.name.to_sym] = value if value && !value.empty?
+          end
+
+          return nil if arguments.empty?
+
+          call_id = "call_#{SecureRandom.hex(8)}"
+          { call_id => ToolCall.new(id: call_id, name: tool_name.to_s, arguments: arguments) }
+        end
+
+        def extract_single_param(bin, param_name, user_message)
+          prompt = "What #{param_name} is mentioned in this text? " \
+                   "Reply with just the value, nothing else.\n\n#{user_message}"
+          payload = { prompt: prompt, model: 'on-device', format: 'json', stream: false }
+
+          stdout, _stderr, status = Open3.capture3(bin, stdin_data: JSON.generate(payload))
+          return nil unless status.success?
+
+          body = begin
+            JSON.parse(stdout)
+          rescue JSON::ParserError
+            return nil
+          end
+          return nil unless body['ok']
+
+          parse_extracted_value(body['output']&.strip, param_name)
+        end
+
+        def parse_extracted_value(raw_output, param_name)
+          return nil if raw_output.nil? || raw_output.empty?
+
+          parsed = JSON.parse(raw_output)
+          parsed.is_a?(Hash) ? (parsed[param_name.to_s] || parsed.values.first).to_s : parsed.to_s
+        rescue JSON::ParserError
+          raw_output.gsub(/\A["']|["']\z/, '')
+        end
+
+        def handle_exit_code(status, stdout, stderr)
+          return if status.success?
+
+          code = status.exitstatus
+          error_msg = EXIT_CODE_ERRORS[code] || "Unknown error (exit code #{code})"
+
+          begin
+            body = JSON.parse(stdout)
+            error_msg = "#{body['error']['code']}: #{body['error']['message']}" if body['error']
+          rescue JSON::ParserError
+            error_msg = "#{error_msg} — #{stderr}" unless stderr.empty?
+          end
+
+          raise_for_exit_code(code, error_msg)
+        end
+
+        def raise_for_exit_code(code, error_msg)
+          case code
+          when 1 then raise RubyLLM::BadRequestError, error_msg
+          when 2 then raise RubyLLM::Error, "Unsupported environment: #{error_msg}"
+          when 3 then raise RubyLLM::ModelNotFoundError, error_msg
+          when 4, 5 then raise RubyLLM::ServerError, error_msg
+          else raise RubyLLM::Error, error_msg
+          end
+        end
+
+        def parse_binary_response(stdout)
+          body = JSON.parse(stdout)
+
+          unless body['ok']
+            error = body['error'] || {}
+            raise RubyLLM::Error, "#{error['code']}: #{error['message']}"
+          end
+
+          output_text = body['output'] || ''
+          tool_calls = extract_tool_calls(output_text)
+
+          Message.new(
+            role: :assistant,
+            content: tool_calls ? '' : output_text,
+            tool_calls: tool_calls,
+            model_id: body['model'] || 'apple-intelligence',
+            input_tokens: 0,
+            output_tokens: estimate_tokens(output_text),
+            raw: body
+          )
+        rescue JSON::ParserError => e
+          raise RubyLLM::Error, "Failed to parse binary response: #{e.message}"
+        end
+
+        def extract_tool_calls(text)
+          parsed = JSON.parse(text.strip)
+          return nil unless parsed.is_a?(Hash) && parsed['tool_call']
+
+          tc = parsed['tool_call']
+          return nil unless tc['name']
+
+          call_id = "call_#{SecureRandom.hex(8)}"
+          arguments = (tc['arguments'] || {}).transform_keys(&:to_sym)
+
+          { call_id => ToolCall.new(id: call_id, name: tc['name'], arguments: arguments) }
+        rescue JSON::ParserError
+          nil
+        end
+
+        def estimate_tokens(text)
+          (text.length / 4.0).ceil
+        end
+      end
+    end
+  end
+end
diff --git a/lib/ruby_llm/providers/apple_intelligence/models.rb b/lib/ruby_llm/providers/apple_intelligence/models.rb
new file mode 100644
index 000000000..60a0255d4
--- /dev/null
+++ b/lib/ruby_llm/providers/apple_intelligence/models.rb
@@ -0,0 +1,38 @@
+# frozen_string_literal: true
+
+module RubyLLM
+  module Providers
+    class AppleIntelligence
+      # Model definitions for Apple Intelligence on-device models
+      module Models
+        module_function
+
+        def models_url
+          nil
+        end
+
+        def parse_list_models_response(_response, slug, _capabilities)
+          [
+            Model::Info.new(
+              id: 'apple-intelligence',
+              name: 'Apple Intelligence (on-device)',
+              provider: slug,
+              family: 'apple-intelligence',
+              created_at: nil,
+              modalities: {
+                input: %w[text],
+                output: %w[text]
+              },
+              capabilities: [],
+              pricing: {},
+              metadata: {
+                local: true,
+                description: 'Apple Foundation Model running on-device via Apple Intelligence'
+              }
+            )
+          ]
+        end
+      end
+    end
+  end
+end
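The stdin protocol used by `execute_binary` can be sketched without invoking the binary itself. This example only assembles the kind of JSON document that `build_payload` produces and pipes to `osx-ai-inloop` (field names taken from the diff above); the helper is illustrative, not the provider's actual code path:

```ruby
require 'json'

# Build the request document described in build_payload: a single prompt
# plus optional system instructions and a flattened history transcript.
def apple_payload(prompt:, system: nil, history: [])
  payload = { prompt: prompt, model: 'on-device', format: 'json', stream: false }
  payload[:system] = system if system
  payload[:input] = history.join("\n") unless history.empty?
  payload
end

request = apple_payload(
  prompt: 'How is that different from a class?',
  system: 'Answer briefly.',
  history: ['user: What is a Ruby module?', 'assistant: A namespace and mixin mechanism.']
)
puts JSON.generate(request)
```

The binary then answers on stdout with a JSON document carrying `ok`, `output`, and optionally `error`/`model` fields, which `parse_binary_response` turns back into a RubyLLM `Message`.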