Local AI

Run supported local AI models on-device and chat with them (availability varies).

Local AI: chat view with a compact control bar.

Overview

Local AI provides an on-device chat UI with two backends:

  • Apple Foundation (when available on your OS/device)
  • LLM.swift (uses locally stored model files)

It also shows live CPU and memory usage so you can see the cost of loading and running a model.

Quick Start

  1. Open Tools -> Local AI.
  2. Choose a backend (Apple Foundation or LLM.swift).
  3. Tap Load.
  4. Type a prompt and send it.

Control Bar

At the top of the chat screen, the control bar shows:

  • Model status (unloaded/loading/loaded/unavailable)
  • Backend selection menu
  • Model picker (LLM.swift only)
  • Load / Unload button
  • Expand/collapse controls to reveal CPU and memory indicators

Tap Load to load the selected backend/model. When loaded, the control bar shows a loaded state and exposes Unload.

Backends

Apple Foundation

Apple Foundation is only available when supported by your OS/device configuration. If it's not available, Lirum shows an unavailable message.
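An availability check of this kind can be sketched with Apple's FoundationModels framework (a sketch of the general approach, not Lirum's actual implementation; requires an OS and device that ship the framework):

```swift
import FoundationModels

// SystemLanguageModel.default reports whether the on-device Apple
// model can be used, and if not, why (ineligible device, Apple
// Intelligence disabled, model assets not yet downloaded, etc.).
let model = SystemLanguageModel.default

switch model.availability {
case .available:
    print("Apple Foundation model is ready to load")
case .unavailable(let reason):
    print("Unavailable: \(reason)")
}
```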

LLM.swift

LLM.swift requires a local model file to be available. Use the Model Library to manage models and ensure one is selected.
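Loading a local model file with the LLM.swift package might look like the following (an illustrative sketch; the initializer labels and template options can differ between LLM.swift versions, and the file name here is an example):

```swift
import LLM  // the LLM.swift package

// Point at a locally stored GGUF model file and request a completion.
let modelURL = URL.documentsDirectory.appending(path: "model.gguf")

if let bot = LLM(from: modelURL, template: .chatML("You are a helpful assistant.")) {
    let reply = await bot.getCompletion(from: "Hello!")
    print(reply)
}
```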

Model Library

Open the Model Library from the toolbar menu to download, manage, and select models.

Model Library: manage and select local models for the LLM.swift backend.
Model details and actions (these vary by model and backend).

Loading And Unloading

  • Load initializes the selected backend/model.
  • Unload releases the model and clears the current conversation.

Large models can take time to load and may fail if the device doesn't have enough free memory.
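On iOS, a loader can guard against this up front with `os_proc_available_memory`, which reports roughly how many more bytes the process may allocate before the system terminates it (a sketch; the 2 GB threshold is an example, and the value is advisory and may be 0 on some platforms):

```swift
import os

// Rough pre-flight check before loading a large model.
let needed = 2 * 1024 * 1024 * 1024  // e.g. a ~2 GB model, as an example
let available = os_proc_available_memory()

if available > 0 && available < needed {
    print("Not enough free memory to load the model safely")
}
```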

Chat

The main UI is a standard chat view:

  • Type a prompt and send it.
  • While a response is streaming, you can stop generation.

Enter a prompt in the chat composer.
After sending, the assistant begins generating a response.
Example response shown in the chat history.
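Stopping a streamed response is typically implemented as task cancellation. The sketch below uses Swift concurrency; the token stream and chat-append call are hypothetical stand-ins for whatever the active backend provides:

```swift
import Foundation

// Hypothetical token stream: a real backend would yield model output here.
func generateTokens(for prompt: String) -> AsyncStream<String> {
    AsyncStream { continuation in
        for word in ["Hello,", " world"] { continuation.yield(word) }
        continuation.finish()
    }
}

var generation: Task<Void, Never>?

func send(_ prompt: String) {
    generation = Task {
        for await token in generateTokens(for: prompt) {
            if Task.isCancelled { break }      // Stop button ends the stream early
            print(token, terminator: "")       // append to the chat view
        }
    }
}

func stop() {
    generation?.cancel()
}
```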

Performance Snapshot

Local AI tracks CPU and memory usage while you use the tool.

In the expanded controls (AI Model panel), you can capture a baseline snapshot and compare baseline against current CPU and memory usage.
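A resident-memory snapshot of this kind is commonly taken with the Mach `task_info` API (a sketch of the general technique, not necessarily how Lirum implements it):

```swift
import Darwin

// Returns the process's current resident memory in bytes, or nil on failure.
func residentMemoryBytes() -> UInt64? {
    var info = mach_task_basic_info()
    var count = mach_msg_type_number_t(
        MemoryLayout<mach_task_basic_info>.size / MemoryLayout<natural_t>.size)
    let result = withUnsafeMutablePointer(to: &info) {
        $0.withMemoryRebound(to: integer_t.self, capacity: Int(count)) {
            task_info(mach_task_self_, task_flavor_t(MACH_TASK_BASIC_INFO), $0, &count)
        }
    }
    return result == KERN_SUCCESS ? UInt64(info.resident_size) : nil
}

// Baseline-vs-current comparison is then a subtraction:
let baseline = residentMemoryBytes()
// ... load and exercise the model ...
let current = residentMemoryBytes()
```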

Export Conversation

Use Export Conversation to share the current chat as a file or text (depending on what's available).
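Exporting as text amounts to flattening the message history into a shareable string, along these lines (`Message` is a hypothetical model type, not Lirum's actual one):

```swift
// Hypothetical conversation model.
struct Message { let role: String; let content: String }

// Flatten the conversation to plain text, one blank line between turns.
func exportText(_ messages: [Message]) -> String {
    messages
        .map { "\($0.role): \($0.content)" }
        .joined(separator: "\n\n")
}

let sample = [Message(role: "user", content: "Hi"),
              Message(role: "assistant", content: "Hello!")]
print(exportText(sample))  // "user: Hi\n\nassistant: Hello!"
```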

Notes And Limitations

  • On-device models can use significant CPU and memory.
  • Model availability, download options, and performance vary by device and OS.
  • Some Apple Foundation features require newer OS versions and supported hardware.