Local AI
Run supported local AI models on-device and chat with them (availability varies).
Overview
Local AI provides an on-device chat UI with two backends:
- Apple Foundation (when available on your OS/device)
- LLM.swift (uses locally stored model files)
It also shows live CPU and memory usage so you can see the cost of loading and running a model.
Table Of Contents
- Quick Start
- Control Bar
- Backends
- Model Library
- Loading And Unloading
- Chat
- Performance Snapshot
- Export Conversation
- Notes And Limitations
Quick Start
- Open Tools -> Local AI.
- Choose a backend (Apple Foundation or LLM.swift).
- Tap Load.
- Type a prompt and send it.
Control Bar
At the top of the chat screen, the control bar has three expansion states:
Compact (Default)
Shows:
- Model status (unloaded/loading/loaded/unavailable)
- Backend selection menu
- Model picker (LLM.swift only)
- Load / Unload button
Middle Expanded
Tap the control bar to expand it and reveal additional indicators:
- Live CPU usage gauge
- Live Memory usage gauge
Full Expanded
Tap again to open the full detail screen with three cards:
- Model Status Card -- shows the backend name, model name, and file size (for LLM.swift models). Includes backend selection and model picker menus.
- Performance Card -- shows a "Baseline" vs "Now" comparison for CPU and memory usage. Tap Capture Baseline to snapshot the current values, then watch how loading and running a model changes resource consumption.
- Actions Card -- contains Load Model / Unload Model, New Conversation (clears messages and reloads), Manage Models (opens the Model Library), and Export Conversation.
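The "Baseline vs. Now" flow in the Performance Card can be pictured as a small snapshot-and-diff routine. The sketch below is illustrative only: the `ResourceSnapshot` and `PerformanceTracker` names are hypothetical, and the reader closure stands in for whatever system API supplies the live CPU/memory values.

```swift
import Foundation

// Hypothetical sketch of "Capture Baseline": record a snapshot of the
// current readings, then diff later readings against it.
struct ResourceSnapshot {
    var cpuPercent: Double
    var memoryBytes: UInt64
}

final class PerformanceTracker {
    private let read: () -> ResourceSnapshot
    private(set) var baseline: ResourceSnapshot?

    init(reader: @escaping () -> ResourceSnapshot) {
        self.read = reader
    }

    func captureBaseline() {
        baseline = read()
    }

    /// Returns the change in CPU and memory relative to the baseline,
    /// or nil if no baseline has been captured yet.
    func deltaFromBaseline() -> (cpu: Double, memory: Int64)? {
        guard let base = baseline else { return nil }
        let now = read()
        return (now.cpuPercent - base.cpuPercent,
                Int64(now.memoryBytes) - Int64(base.memoryBytes))
    }
}
```

Capturing a baseline right before tapping Load makes the model's resource cost directly visible as the delta.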
The control bar remembers its expansion state between sessions.
Backends
Apple Foundation
Apple Foundation uses Apple's built-in FoundationModels framework. It requires iOS 26.0+ or visionOS 26.0+ and supported hardware. If it is not available on your device, Lirum shows an unavailable message. Availability is rechecked whenever the app comes to the foreground.
LLM.swift
LLM.swift runs GGUF model files locally on your device. It uses the ChatML message template and streams responses token by token as they are generated.
Technical details:
- Conversation history is maintained with an 8-turn limit -- older messages are dropped to keep context manageable.
- Responses have a 2-minute timeout. If a model does not produce output within that time, an error is shown.
- Special model tokens (such as `<|...|>` markers) are automatically stripped from responses.
- If a KV cache error occurs, Lirum shows a specific diagnostic message.
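The history-trimming and token-stripping behaviors described above can be sketched as follows. This assumes a simple (role, text) message model; the `ChatMessage` type and function names are hypothetical, not LLM.swift's actual API.

```swift
import Foundation

// Stand-in message type for illustration.
struct ChatMessage {
    let role: String   // "user" or "assistant"
    let text: String
}

/// Keep at most the last `maxTurns` turns, where one turn is a
/// user message plus the assistant's reply.
func trimHistory(_ messages: [ChatMessage], maxTurns: Int = 8) -> [ChatMessage] {
    let maxMessages = maxTurns * 2   // one user + one assistant message per turn
    return Array(messages.suffix(maxMessages))
}

/// Strip special model tokens such as <|im_end|> from a response.
func stripSpecialTokens(_ response: String) -> String {
    response.replacingOccurrences(of: "<\\|[^|]*\\|>",
                                  with: "",
                                  options: .regularExpression)
            .trimmingCharacters(in: .whitespacesAndNewlines)
}
```

Dropping whole turns (rather than individual messages) keeps the remaining history coherent, since every user message stays paired with its reply.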
Model Library
Open the Model Library from the toolbar menu to download, manage, and select models. The library has three sections:
Installed Models
Lists all downloaded model folders with their name, file count, and total size. You can:
- Select a model to use it with LLM.swift.
- Import a GGUF file from the iOS Files app.
- Enter selection mode to batch-export or batch-delete multiple models at once.
Catalog
A curated list of models bundled with the app. Each entry shows the model name, parameter count, and colored tags indicating characteristics:
| Tag | Meaning |
|---|---|
| Chat | General-purpose conversational model |
| Instructions | Tuned for following instructions |
| Reasoning | Designed for step-by-step reasoning |
| Coding | Optimized for code generation |
| Recommended | Tested and works well on-device |
| Fast | Generates responses quickly |
| Slow | May be slow on some devices |
| Tested | Verified to work in Lirum |
| Experimental | May produce inconsistent results |
| Untested | Not yet verified |
Sort the catalog by Default, Alphabetical, Date (Newest/Oldest First), or Parameters (Largest/Smallest First).
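The sort options above map onto straightforward comparisons. In this sketch, the `CatalogEntry` fields and `SortOrder` cases are assumptions for illustration, not Lirum's actual definitions.

```swift
import Foundation

// Illustrative catalog entry: name, release date, and size in
// billions of parameters (e.g. 1.5 for a 1.5B model).
struct CatalogEntry {
    let name: String
    let releaseDate: Date
    let parameterCount: Double
}

enum SortOrder {
    case alphabetical, newestFirst, largestFirst
}

func sorted(_ entries: [CatalogEntry], by order: SortOrder) -> [CatalogEntry] {
    switch order {
    case .alphabetical:
        return entries.sorted {
            $0.name.localizedCaseInsensitiveCompare($1.name) == .orderedAscending
        }
    case .newestFirst:
        return entries.sorted { $0.releaseDate > $1.releaseDate }
    case .largestFirst:
        return entries.sorted { $0.parameterCount > $1.parameterCount }
    }
}
```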
Active Downloads
Shows any currently downloading models with:
- Download progress (percentage, speed in MB/s, estimated time remaining)
- Abort and Resume controls
Manual Model Entry
You can also add models manually in two ways:
- Import from Files -- opens the iOS file picker for GGUF files and copies them with a progress display.
- Manual URL download -- enter a direct download URL along with model name, quantization, and parameter count. Fields can be auto-filled from the catalog or parsed from the filename.
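Filename parsing for the manual-entry fields might look like the sketch below, which handles common GGUF naming patterns such as `qwen2.5-1.5b-instruct-q4_k_m.gguf`. The parsing rules and the `ParsedModelFields` type are assumptions for illustration, not Lirum's actual parser.

```swift
import Foundation

// Hypothetical result of parsing a GGUF filename.
struct ParsedModelFields {
    var name: String
    var parameters: String?    // e.g. "1.5B"
    var quantization: String?  // e.g. "Q4_K_M"
}

func parseGGUFFilename(_ filename: String) -> ParsedModelFields {
    let stem = filename.hasSuffix(".gguf")
        ? String(filename.dropLast(5))
        : filename

    // Parameter count: a number followed by "b" (billions) or "m" (millions).
    var parameters: String?
    if let range = stem.range(of: "(?i)\\b[0-9]+(\\.[0-9]+)?[bm]\\b",
                              options: .regularExpression) {
        parameters = stem[range].uppercased()
    }

    // Quantization: a trailing tag such as q4_k_m or q8_0.
    var quantization: String?
    if let range = stem.range(of: "(?i)q[0-9]+(_[a-z0-9]+)*$",
                              options: .regularExpression) {
        quantization = stem[range].uppercased()
    }

    return ParsedModelFields(name: stem,
                             parameters: parameters,
                             quantization: quantization)
}
```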
Loading And Unloading
- Load initializes the selected backend/model.
- Unload releases the model and clears the current conversation.
Large models can take time to load and may fail if the device doesn't have enough free memory.
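One way an app could guard against out-of-memory load failures is a pre-load sanity check comparing the model's file size against an available-memory budget. The 1.2x overhead factor and everything else in this sketch are illustrative assumptions, not Lirum's actual logic.

```swift
import Foundation

enum LoadError: Error {
    case insufficientMemory(required: UInt64, available: UInt64)
}

/// Throws if the model is unlikely to fit in the given memory budget.
func checkMemoryBeforeLoad(modelFileSize: UInt64, availableMemory: UInt64) throws {
    // A loaded GGUF model typically needs somewhat more RAM than its
    // file size (KV cache, activations), so pad the requirement.
    let required = UInt64(Double(modelFileSize) * 1.2)
    guard required <= availableMemory else {
        throw LoadError.insufficientMemory(required: required,
                                           available: availableMemory)
    }
}
```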
Chat
The main UI is a standard chat view:
- Type a prompt and send it.
- While a response is streaming, you can stop generation.
Performance Snapshot
Local AI tracks CPU and memory usage while you use the tool.
In the expanded controls (AI Model panel), you can capture a baseline snapshot and compare baseline vs current CPU/memory.
(This is the same Capture Baseline control described under the Performance Card above.)
Export Conversation
Use Export Conversation to share the current chat history. The conversation is exported as Markdown text with role prefixes (User: and Assistant:) for each message. You can then share it via any standard iOS sharing method.
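The export format described above reduces to prefixing each message with its role. In this sketch the `Message` type and the blank-line separator between messages are assumptions for illustration:

```swift
import Foundation

// Stand-in message type for illustration.
struct Message {
    let isUser: Bool
    let text: String
}

/// Render the conversation as Markdown text with role prefixes.
func exportConversation(_ messages: [Message]) -> String {
    messages.map { message in
        let prefix = message.isUser ? "User:" : "Assistant:"
        return "\(prefix) \(message.text)"
    }
    .joined(separator: "\n\n")
}
```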
Notes And Limitations
- On-device models can use significant CPU and memory.
- Model availability, download options, and performance vary by device and OS.
- Apple Foundation requires iOS 26.0+ or visionOS 26.0+ and supported hardware.
- LLM.swift is not available on macOS Catalyst builds.
- Large models may fail to load if the device does not have enough free memory.
- The LLM.swift backend has an 8-turn conversation history limit and a 2-minute response timeout.