AURA AI

AURA AI is a local-first browser studio for generating, organizing, editing, and iterating on AI images with OpenAI and Google-hosted image models.

The app runs entirely in the browser. Provider API keys, generated images, reference images, layer assets, working session state, archive metadata, and lineage history stay on the local device instead of passing through an application backend.

The interface follows the Telepathic Instruments-inspired visual system documented in docs/DESIGN.md: stark panels, monochrome surfaces, amber action emphasis, compact controls, and typography tuned for a focused creative tool rather than a marketing page.

Screens

Screenshots of the Generate, Archive, Editor, and Detail views.

Highlights

Prompt-based image generation with gpt-image-2 and nano-banana-pro
Single Shot and Autopilot generation modes
Batch generation of up to four images per run with a per-slot result grid, save-all, and per-result reuse actions
Streaming partial-image previews during single-shot generation for models that support it
Reuse any generated result as a reference image with a single action
Actual generation parameter reporting for revised prompts, size, quality, and elapsed time
Goal-to-prompt translation, iterative scoring, and prompt refinement with selectable reasoning models: gpt-5.4 and gemini-2.5-flash
Provider-specific API key storage for OpenAI and Google
Prompt enhancement controls for style, lighting, palette, and model-specific output settings
Shared image-model facts for Generate and Editor controls, provider routing, capabilities, reference limits, and archive metadata
Reference-image workflows for guided generation and AI-assisted edits, including clipboard paste
Transform-mask painting for targeted AI edits, persisted in lineage and replayable into the editor
Creative lineage tracking across generation, create-similar, editor saves, AI edits, save-as-copy branches, and Autopilot iterations
Local archive with search, favorites filtering, multi-select actions, layer-aware ZIP export/import, manifest recovery, lineage-aware detail view, replay actions, fork actions, and keyboard navigation
Layered in-browser editor with image layers, blend modes, layer locking, drag reordering, keyboard nudging, live composition adjustments, AI result layers, non-destructive drafts, undo/redo, overwrite, save-as-copy, reset, and revert controls
Background completion notifications for finished generation runs
Persistent local UI state for prompts, model-specific generation settings, Autopilot settings, archive search and favorites filter, editor drafts, editor controls, and notification preferences
Local-first persistence powered by SQLocal and IndexedDB

Tech Stack

React 19
TypeScript
Vite 7
SQLocal for browser-local SQLite metadata
idb-keyval for binary and transient IndexedDB storage
JSZip for archive export bundles
Konva and React Konva for the layered editor canvas
Lucide React for iconography
Vitest for module and workflow tests

Runtime Requirements

Node.js 20.19+ or 22.12+
npm 10+

Getting Started

npm install
npm run dev

Open the app in your browser, go to Settings, and enter the provider keys for the models you want to use. OpenAI powers gpt-image-2 and gpt-5.4; Google powers nano-banana-pro and gemini-2.5-flash.

Available Scripts

npm run dev
npm run test
npm run typecheck
npm run build
npm run lint
npm run audit
npm run audit:fix
npm run preview

Script Reference

npm run dev                      Starts the Vite development server.
npm run dev -- --port 5175       Starts the Vite development server on a custom port.
npm run test                     Runs the Vitest suite in non-watch mode.
npm run typecheck                Runs the TypeScript project build in type-check mode.
npm run build                    Type-checks the app and creates a production build.
npm run lint                     Runs ESLint across the repository.
npm run audit                    Runs npm audit against the current lockfile.
npm run audit:fix                Applies lockfile-only audit remediations for transitive vulnerabilities.
npm run preview                  Serves the production build locally with Vite preview.
npm run preview -- --port 4174   Serves the production build on a custom preview port.

Application Overview

Generate

The Generate view supports:

Image model selection between GPT Image 2 and Nano Banana Pro
Mode toggle between Single Shot and Autopilot
Free-form text prompts plus example prompt presets
Goal-to-prompt translation for Autopilot mode
Reasoning model selection between GPT 5.4 and Gemini 2.5 Flash in Autopilot mode
GPT Image 2 quality options: low, medium, high
GPT Image 2 size options: auto, 1024x1024, 1536x1024, 1024x1536
GPT Image 2 background options: auto, opaque, transparent
Nano Banana Pro aspect ratio options: 1:1, 2:3, 3:2, 3:4, 4:3, 4:5, 5:4, 9:16, 16:9, 21:9
Nano Banana Pro resolution options: 1K, 2K, 4K
GPT Image 2 batch size options: 1, 2, 3, 4
Nano Banana Pro batch size options: 1, 2, 3, 4
Style, lighting, and palette modifiers that are merged into the request prompt
Configurable Autopilot iteration count from 1 to 8
Configurable Autopilot satisfaction threshold from 50 to 100
Cost disclosure and confirmation before each Autopilot run, including the selected image and reasoning models
Live Autopilot progress, best-iteration highlighting, and pause/cancel support
Multiple reference image uploads through file picker, drag-and-drop, and clipboard paste
Nano Banana Pro reference inputs are capped to the first 14 images for provider compatibility
Reference preview modal with next and previous navigation
Streaming partial-image previews during single-shot generation for models that support partial streaming
A batch result grid for multi-image runs with per-slot save, download, use-as-reference, and isolated per-slot failure reporting
Save All and Clear Results actions for batch runs
Use as Reference to feed a generated result back into the reference set while preserving lineage
Actual parameter panels that surface the revised prompt, size, quality, and elapsed time returned with each result
Save-to-archive, download, and clear-result actions
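
The documented limits above (Autopilot iterations 1 to 8, satisfaction threshold 50 to 100, and the 14-image reference cap for Nano Banana Pro) could be enforced with small pure helpers. The sketch below is illustrative only: every function and type name is hypothetical, and the way modifiers are joined into the prompt is an assumption rather than the app's actual wording.

```typescript
// Hypothetical helpers: names and shapes are illustrative, not taken from the AURA AI source.

/** Round to the nearest integer and clamp into [min, max]; non-finite input falls back to min. */
function clampInt(value: number, min: number, max: number): number {
  if (!Number.isFinite(value)) return min;
  return Math.min(max, Math.max(min, Math.round(value)));
}

interface AutopilotSettings {
  iterations: number; // documented range: 1 to 8
  threshold: number; // documented range: 50 to 100
}

function normalizeAutopilotSettings(input: AutopilotSettings): AutopilotSettings {
  return {
    iterations: clampInt(input.iterations, 1, 8),
    threshold: clampInt(input.threshold, 50, 100),
  };
}

const NANO_BANANA_PRO_MAX_REFERENCES = 14;

/** Keep only the first 14 references for Nano Banana Pro; other models pass through unchanged in this sketch. */
function capReferences<T>(
  model: "gpt-image-2" | "nano-banana-pro",
  references: readonly T[],
): T[] {
  return model === "nano-banana-pro"
    ? references.slice(0, NANO_BANANA_PRO_MAX_REFERENCES)
    : [...references];
}

/** Assumption: non-empty modifiers are appended to the prompt as a comma-separated suffix. */
function mergePromptModifiers(
  prompt: string,
  modifiers: { style?: string; lighting?: string; palette?: string },
): string {
  const parts = [modifiers.style, modifiers.lighting, modifiers.palette]
    .filter((part): part is string => typeof part === "string" && part.trim().length > 0)
    .map((part) => part.trim());
  return parts.length > 0 ? `${prompt.trim()}, ${parts.join(", ")}` : prompt.trim();
}
```

Keeping these as pure functions means the same rules can be applied when restoring persisted settings and again just before a provider request is built.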

Prompt-only GPT Image 2 generations use the OpenAI generations endpoint. GPT Image 2 requests with reference images use the OpenAI edits endpoint so the request can include uploaded image inputs. Nano Banana Pro generation and reference-guided generation use Google Gemini generateContent requests with text and inline image parts.
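
The routing rule in the paragraph above can be expressed as a single decision function. This is a minimal sketch: the `routeImageRequest` name and the `ImageRoute` shape are hypothetical, while the endpoint labels come from the Provider Integration section of this README.

```typescript
// Hypothetical sketch of the routing rule; not the app's actual module.

type ImageModel = "gpt-image-2" | "nano-banana-pro";

interface ImageRoute {
  provider: "openai" | "google";
  endpoint: string;
}

function routeImageRequest(model: ImageModel, referenceCount: number): ImageRoute {
  if (model === "nano-banana-pro") {
    // Google handles both prompt-only and reference-guided requests through generateContent.
    return { provider: "google", endpoint: "Gemini generateContent" };
  }
  // GPT Image 2: references require the edits endpoint so image inputs can be uploaded.
  return referenceCount > 0
    ? { provider: "openai", endpoint: "POST /v1/images/edits" }
    : { provider: "openai", endpoint: "POST /v1/images/generations" };
}
```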

Batch runs request up to four images per generation. Nano Banana Pro fans batch requests out into parallel generateContent calls so a failed slot stays isolated while the rest of the batch succeeds.
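
One way to get per-slot isolation is to convert each request's success or failure into a plain result object before the requests are joined. The sketch below assumes a caller-supplied `requestOne` function standing in for a single generateContent call; `fanOutBatch` and `SlotResult` are hypothetical names.

```typescript
// Hypothetical sketch: requestOne stands in for one provider call per slot.

type SlotResult<T> =
  | { slot: number; ok: true; value: T }
  | { slot: number; ok: false; error: string };

async function fanOutBatch<T>(
  count: number,
  requestOne: (slot: number) => Promise<T>,
): Promise<SlotResult<T>[]> {
  // The README documents batch sizes of 1 to 4.
  const size = Math.min(4, Math.max(1, Math.round(count)));
  const slots = Array.from({ length: size }, (_, i) => i);

  return Promise.all(
    slots.map((slot) =>
      // Promise.resolve().then(...) also captures synchronous throws from requestOne.
      Promise.resolve()
        .then(() => requestOne(slot))
        .then(
          (value): SlotResult<T> => ({ slot, ok: true, value }),
          (err: unknown): SlotResult<T> => ({
            slot,
            ok: false,
            error: err instanceof Error ? err.message : String(err),
          }),
        ),
    ),
  );
}
```

Because every slot settles to a value rather than rejecting, the outer `Promise.all` never short-circuits, and the result grid can render successes and failures side by side.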

Saved Generate results use the provider-run reference image snapshot, so archive metadata and lineage reflect the exact images sent to the model even if the visible reference collection changes later.

Autopilot reuses the current image model settings and provider-used reference snapshot for every iteration, evaluates results against the goal with the selected reasoning model, refines the prompt between iterations, and keeps the best-scoring result as the primary output. Autopilot result slots carry lineage and actual parameter metadata like regular generated results.
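
The generate, score, refine cycle described above can be sketched as a loop over injected provider functions. Everything here is hypothetical naming, and one behavior is an assumption: the loop stops early once a score reaches the satisfaction threshold, which the README implies but does not state outright.

```typescript
// Hypothetical sketch: generate, score, and refine stand in for provider calls.

interface AutopilotDeps {
  generate: (prompt: string) => Promise<string>; // returns an image, e.g. a data URL
  score: (goal: string, image: string) => Promise<number>; // 0 to 100
  refine: (goal: string, prompt: string, lastScore: number) => Promise<string>;
}

interface AutopilotOptions {
  goal: string;
  initialPrompt: string;
  maxIterations: number; // documented range: 1 to 8
  threshold: number; // documented range: 50 to 100
}

interface AutopilotIteration {
  index: number;
  prompt: string;
  image: string;
  score: number;
}

async function runAutopilot(
  options: AutopilotOptions,
  deps: AutopilotDeps,
): Promise<{ best: AutopilotIteration; iterations: AutopilotIteration[] }> {
  const maxIterations = Math.min(8, Math.max(1, Math.round(options.maxIterations)));
  const threshold = Math.min(100, Math.max(50, options.threshold));
  const iterations: AutopilotIteration[] = [];
  let prompt = options.initialPrompt;

  for (let index = 0; index < maxIterations; index++) {
    const image = await deps.generate(prompt);
    const score = await deps.score(options.goal, image);
    iterations.push({ index, prompt, image, score });
    // Assumption: reaching the satisfaction threshold ends the run early.
    if (score >= threshold) break;
    // Refine only when another iteration will actually run.
    if (index < maxIterations - 1) {
      prompt = await deps.refine(options.goal, prompt, score);
    }
  }

  // Keep the best-scoring iteration as the primary output (earliest wins on ties).
  const best = iterations.reduce((a, b) => (b.score > a.score ? b : a));
  return { best, iterations };
}
```

Injecting the three functions keeps the loop testable without network access and leaves the choice of reasoning model to the caller.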

Editor

The Editor view supports:

A Konva-backed layered canvas with a locked base layer for the opened archive image
Uploaded raster image layers that become part of the visible composition
Layer selection, multi-selection, rename, visibility, opacity, lock, blend mode, reorder, move up/down, duplicate, and delete actions
Layer blend modes: normal, multiply, screen, overlay, darken, lighten, soft-light, and difference
Drag reordering of non-base layers and locking to protect a layer from transform edits
Direct move, scale, and rotation handles for the primary selected non-base layer
Keyboard nudging of selected layers by 1 pixel, or 10 pixels with Shift
Brightness, contrast, and saturation controls applied live to the full composition
Quick filters: Normal, B&W, Sepia, and Soft, applied live to the canvas
Collapsible Adjustments and Filters sections
AI transformation prompts with selectable image models
AI transform model selection that can inherit the source image model while still allowing an explicit override
AI transforms targeted to selected visible non-base layers, or to the whole visible composition when no editable layer is selected
AI transform requests separate the editable source image, composition context, and optional user reference images before provider mapping
Transform-mask painting with brush and eraser tools and an adjustable brush size for models that support masked edits
AI result layers inserted non-destructively above the targeted layer selection
Optional visual context reference images for edit guidance through file picker, drag-and-drop, or clipboard paste
Unsaved editor drafts persisted per archive image
Undo and redo for layer, adjustment, reference, and AI result changes
Save changes in place
Save as copy
Reset Adjustments and Revert Draft controls
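
The keyboard nudging rule (1 pixel, or 10 pixels with Shift) reduces to a small pure function. This sketch uses a hypothetical `EditorLayer` shape and assumes canvas coordinates where y grows downward; it also assumes that locked layers and the base layer are skipped, which follows from the locking description above but is not confirmed by the source.

```typescript
// Hypothetical sketch: layer shape and function names are illustrative.

interface EditorLayer {
  id: string;
  x: number;
  y: number;
  locked: boolean;
  isBase: boolean;
}

type ArrowKey = "ArrowUp" | "ArrowDown" | "ArrowLeft" | "ArrowRight";

// Assumption: y increases downward, as on an HTML canvas.
const NUDGE_DELTAS: Record<ArrowKey, { dx: number; dy: number }> = {
  ArrowUp: { dx: 0, dy: -1 },
  ArrowDown: { dx: 0, dy: 1 },
  ArrowLeft: { dx: -1, dy: 0 },
  ArrowRight: { dx: 1, dy: 0 },
};

function nudgeLayers(
  layers: readonly EditorLayer[],
  selectedIds: ReadonlySet<string>,
  key: ArrowKey,
  shiftKey: boolean,
): EditorLayer[] {
  const step = shiftKey ? 10 : 1;
  const { dx, dy } = NUDGE_DELTAS[key];
  return layers.map((layer) => {
    // Assumption: locked layers and the base layer never move.
    const movable = selectedIds.has(layer.id) && !layer.locked && !layer.isBase;
    return movable ? { ...layer, x: layer.x + dx * step, y: layer.y + dy * step } : layer;
  });
}
```

Returning new layer objects instead of mutating in place fits an undo/redo history, since each nudge yields a distinct snapshot.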

Editor saves are recorded in lineage as overwrite, save-as-copy, manual-edit, or AI-edit steps depending on the action taken. Transform masks used for AI edits are stored with the lineage step and replayed back into the editor when an edit branch is reopened. Layered images keep durable layer stack metadata and per-layer image assets alongside the flattened archive preview.

Editor lineage metadata records the target plan, source image, composition context, reference images, output layer, transform mask, layer stack summary, and save mode needed to summarize or replay an edit branch.
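
As a rough picture of what such a lineage record might look like, the fields listed above map naturally onto a typed object. The interface below is a hypothetical sketch, not the schema stored by the app: field names, ID-based references, and the summary format are all assumptions.

```typescript
// Hypothetical sketch of an Editor lineage step; the real schema may differ.

type EditorStepKind = "overwrite" | "save-as-copy" | "manual-edit" | "ai-edit";

interface EditorLineageStep {
  kind: EditorStepKind;
  sourceImageId: string;
  targetLayerIds: string[]; // target plan; empty means the whole visible composition
  compositionContextId: string | null;
  referenceImageIds: string[];
  outputLayerId: string | null;
  transformMaskId: string | null;
  layerStackSummary: string;
  saveMode: "in-place" | "copy";
}

/** Build a one-line description of a step, e.g. for a lineage timeline entry. */
function summarizeEditorStep(step: EditorLineageStep): string {
  const target =
    step.targetLayerIds.length > 0
      ? `${step.targetLayerIds.length} layer(s)`
      : "whole composition";
  const mask = step.transformMaskId !== null ? "with mask" : "no mask";
  return `${step.kind} on ${target}, ${step.referenceImageIds.length} reference(s), ${mask}`;
}
```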

Settings

The Settings view supports:

Local OpenAI API key storage in the browser
Local Google Gemini API key storage in the browser
Saved-key status feedback and masked key entry
Immediate model availability once the matching provider key is stored
A completion notifications toggle that surfaces a desktop notification when a run finishes while the app is in the background
Notification readiness status that reflects unsupported browsers and insecure contexts
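
The readiness status can be derived from three browser facts: whether the Notification API exists, whether the page runs in a secure context, and the current permission value. The sketch below takes those facts as plain inputs so it stays testable outside a browser; the status labels and function names are hypothetical.

```typescript
// Hypothetical sketch: status labels and names are illustrative.

interface NotificationEnvironment {
  hasNotificationApi: boolean; // in a browser: "Notification" in window
  isSecureContext: boolean; // in a browser: window.isSecureContext
  permission: "default" | "granted" | "denied"; // in a browser: Notification.permission
}

type NotificationReadiness =
  | "unsupported"
  | "insecure-context"
  | "blocked"
  | "needs-permission"
  | "ready";

function getNotificationReadiness(env: NotificationEnvironment): NotificationReadiness {
  if (!env.hasNotificationApi) return "unsupported";
  if (!env.isSecureContext) return "insecure-context";
  if (env.permission === "denied") return "blocked";
  if (env.permission === "default") return "needs-permission";
  return "ready";
}

/** Notify only when the toggle is on, the app is in the background, and the browser is ready. */
function shouldNotify(
  readiness: NotificationReadiness,
  toggleEnabled: boolean,
  documentHidden: boolean,
): boolean {
  return toggleEnabled && documentHidden && readiness === "ready";
}
```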

The sidebar includes a collapsible navigation rail.

Storage Model

The application is designed as a local-first web app.

Provider API keys are stored in browser localStorage
View state, generation drafts, model-specific generation settings, Autopilot settings, archive search, archive favorites filter, completion notification preference, and editor drafts are stored in browser localStorage
Current generated batch results and transferred reference payloads are stored in IndexedDB via idb-keyval
Archive image metadata is stored in a browser-local SQLite database via SQLocal
Layer stack metadata is stored with archive image metadata in SQLocal
Flattened images, reference images, and per-layer image assets are stored in IndexedDB via idb-keyval
Lineage metadata, including typed Generate, Editor, Autopilot, transform-mask, and actual-parameter metadata, is stored in a browser-local SQLite database via SQLocal
Favorite flags are stored with archive image metadata in SQLocal
Archive ZIP bundles contain image files, reference files, layer asset files, archive-manifest.json, and lineage-manifest.json
Archive import, export, delete, copy, and metadata recovery flows share the same archive manifest and asset ownership language
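
The split across the three storage tiers can be summarized as a lookup table. The sketch below restates the list above in code; the `DataKind` labels and the `STORAGE_TIER` constant are hypothetical and exist only to make the mapping explicit.

```typescript
// Hypothetical summary of the storage split described in this section.

type StorageTier = "localStorage" | "IndexedDB (idb-keyval)" | "SQLite (SQLocal)";

type DataKind =
  | "providerApiKey"
  | "uiState"
  | "editorDraft"
  | "currentBatchResults"
  | "flattenedImage"
  | "referenceImage"
  | "layerAsset"
  | "archiveMetadata"
  | "layerStackMetadata"
  | "lineageMetadata"
  | "favoriteFlag";

const STORAGE_TIER: Record<DataKind, StorageTier> = {
  // Small, synchronous preferences and secrets
  providerApiKey: "localStorage",
  uiState: "localStorage",
  editorDraft: "localStorage",
  // Binary and transient payloads
  currentBatchResults: "IndexedDB (idb-keyval)",
  flattenedImage: "IndexedDB (idb-keyval)",
  referenceImage: "IndexedDB (idb-keyval)",
  layerAsset: "IndexedDB (idb-keyval)",
  // Queryable structured metadata
  archiveMetadata: "SQLite (SQLocal)",
  layerStackMetadata: "SQLite (SQLocal)",
  lineageMetadata: "SQLite (SQLocal)",
  favoriteFlag: "SQLite (SQLocal)",
};

function storageTierFor(kind: DataKind): StorageTier {
  return STORAGE_TIER[kind];
}
```

The pattern is that binary payloads live in IndexedDB while anything that needs searching or filtering lives in SQLite.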

There is no custom backend service in this repository.

Provider Integration

The app calls provider APIs directly from the browser.

OpenAI image generation uses POST /v1/images/generations
OpenAI reference-based generation and editor transforms use POST /v1/images/edits
OpenAI Autopilot reasoning uses POST /v1/responses
Google image generation and editing use Gemini generateContent
Google Autopilot reasoning uses Gemini generateContent
Image models: gpt-image-2, nano-banana-pro
Reasoning models: gpt-5.4, gemini-2.5-flash
Shared image-model control facts drive UI choices, default values, validation, provider request mapping, reference limits, mask capability, streaming capability, and archive metadata
The app requests between one and four images per generation, fanning Nano Banana Pro batches out into isolated parallel requests
Single-image OpenAI generations can stream partial-image previews when the model supports it
Editor AI transforms can include a painted mask for models that support masked edits
Image responses are consumed as base64 payloads and converted into browser-safe data URLs for preview and persistence
Provider responses report actual generation parameters such as the revised prompt, size, quality, and elapsed time
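
Converting a base64 image payload into a data URL is a matter of prefixing the MIME type. A minimal sketch follows; the `toDataUrl` name is hypothetical, and the `image/png` default is an assumption, since the actual MIME type depends on what the provider returns.

```typescript
// Hypothetical helper; the default MIME type is an assumption.

function toDataUrl(base64Payload: string, mimeType: string = "image/png"): string {
  // Pass through values that are already data URLs.
  if (base64Payload.startsWith("data:")) return base64Payload;
  // Strip whitespace and line breaks that some encoders insert.
  const cleaned = base64Payload.replace(/\s+/g, "");
  return `data:${mimeType};base64,${cleaned}`;
}
```

The resulting string can be assigned directly to an image element's `src` or stored as-is.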

Additional implementation details live in:

docs/openAI_image_generation.md
docs/openAI_create_image.md

Privacy and Security

The project is designed for local use in the browser
Secrets are not committed to the repository
The repository does not ship with embedded API keys, .env files, or private key material
Sensitive provider request payloads are not logged by the client helpers

If you fork this project, keep the same standard for your own commits and issues.

Project Structure

src/
  app/             App-level controller, notifications, and persisted preferences
  archive/         Archive storage, ZIP export/import helpers, and archive controllers
  autopilot/       Autopilot orchestration and reasoning-model helper modules
  components/      Reusable UI components and modals
  db/              SQLocal bootstrap and persistence types
  download/        Local download helpers for images and ZIP bundles
  editor/          Canvas editing, editor sessions, and save flows
  generate-session Generate draft persistence, save logic, and Autopilot glue
  hooks/           Shared React hooks for local storage and archive state
  image-models/    Image-model control facts, validation, limits, and provider request mapping
  image-workflow/  Provider request orchestration for generate and edit flows
  lineage/         Lineage storage, replay, timelines, and metadata helpers
  references/      Reference image collection state and hydration helpers
  services/        IndexedDB-backed storage adapters
  utils/           Provider model constants, OpenAI helpers, and file conversion helpers
  views/           Generate, Archive, Editor, and Settings views
docs/
  agentic-creative-autopilot-prd.md
  creative-lineage-autopilot-qa-plan.md
  creative-lineage-graph-prd.md
  DESIGN.md
  adr/
  openAI_create_image.md
  openAI_image_generation.md
plans/
  creative-lineage-and-autopilot.md
  layered-editor.md
  localstorage-to-sqlite.md
  telepathic-instruments-rebrand.md

Documentation

CONTEXT.md defines the repo's domain vocabulary for image models, providers, lineage, and layered editor concepts
docs/DESIGN.md defines the Telepathic Instruments-inspired visual design language used by the app
docs/openAI_image_generation.md describes the current provider integration and request routing
docs/openAI_create_image.md maps Generate, Editor, and Autopilot flows to the request payloads used by the app
docs/adr/ captures durable architecture decisions for archive assets, Konva canvas rendering, editor history, copy semantics, AI transform targeting, and layered-image adjustments
docs/creative-lineage-graph-prd.md captures the lineage product requirements
docs/agentic-creative-autopilot-prd.md captures the Autopilot product requirements
docs/creative-lineage-autopilot-qa-plan.md outlines QA coverage for lineage and Autopilot flows
plans/creative-lineage-and-autopilot.md summarizes the implementation plan behind the current lineage and Autopilot architecture
plans/layered-editor.md summarizes the implementation plan behind the current layered editor architecture

License

This project is released under the MIT License. See LICENSE for details.
