diff --git a/USER_ISOLATION_IMPLEMENTATION.md b/USER_ISOLATION_IMPLEMENTATION.md deleted file mode 100644 index 324c40db562..00000000000 --- a/USER_ISOLATION_IMPLEMENTATION.md +++ /dev/null @@ -1,169 +0,0 @@ -# User Isolation Implementation Summary - -This document describes the implementation of user isolation features in the InvokeAI session queue and processing system to address issues identified in the enhancement request. - -## Issues Addressed - -### 1. Cross-User Image/Preview Visibility -**Problem:** When two users are logged in simultaneously and one initiates a generation, the generation preview shows up in both users' browsers and the generated image gets saved to both users' image boards. - -**Solution:** Implemented socket-level event filtering based on user authentication: - -#### Backend Changes (`invokeai/app/api/sockets.py`): -- Added socket authentication middleware in `_handle_connect()` method -- Extracts JWT token from socket auth data or HTTP headers -- Verifies token using existing `verify_token()` function -- Stores `user_id` and `is_admin` in socket session for later use -- Modified `_handle_queue_event()` to filter events by user: - - For `QueueItemEventBase` events, only emit to: - - The user who owns the queue item (`user_id` matches) - - Admin users (`is_admin` is True) - - For general queue events, emit to all subscribers - -#### Event System Changes (`invokeai/app/services/events/events_common.py`): -- Added `user_id` field to `QueueItemEventBase` class -- Updated all event builders to include `user_id` from queue items: - - `InvocationStartedEvent.build()` - - `InvocationProgressEvent.build()` - - `InvocationCompleteEvent.build()` - - `InvocationErrorEvent.build()` - - `QueueItemStatusChangedEvent.build()` - -### 2. Batch Field Values Privacy -**Problem:** Users can see batch field values from generation processes launched by other users. 
- -**Solution:** Implemented field value sanitization at the API level: - -#### API Router Changes (`invokeai/app/api/routers/session_queue.py`): -- Created `sanitize_queue_item_for_user()` helper function - - Clears `field_values` for non-admin users viewing other users' items - - Admins and item owners can see all field values -- Updated endpoints to require authentication and sanitize responses: - - `list_all_queue_items()` - Added `CurrentUser` dependency - - `get_queue_items_by_item_ids()` - Added `CurrentUser` dependency - - `get_queue_item()` - Added `CurrentUser` dependency - -### 3. Queue Updates Across Browser Windows -**Problem:** When the job queue tab is open in multiple browsers and a generation is begun in one browser window, the queue does not update in the other window. - -**Status:** This issue is likely resolved by the socket authentication and event filtering changes. The existing socket subscription mechanism (`subscribe_queue` event) already supports multiple connections per user. Testing is required to confirm this works correctly with the new authentication flow. - -### 4. User Information Display -**Problem:** Queue table lacks user identification, making it difficult to know who launched which job. 
- -**Solution:** Added user information to queue items and UI: - -#### Database Layer (`invokeai/app/services/session_queue/session_queue_sqlite.py`): -- Updated SQL queries to JOIN with `users` table -- Modified methods to fetch user information: - - `get_queue_item()` - Now selects `display_name` and `email` from users table - - `dequeue()` - Includes user info - - `get_next()` - Includes user info - - `get_current()` - Includes user info - - `list_all_queue_items()` - Includes user info - -#### Data Model Changes (`invokeai/app/services/session_queue/session_queue_common.py`): -- Added optional fields to `SessionQueueItem`: - - `user_display_name: Optional[str]` - Display name from users table - - `user_email: Optional[str]` - Email from users table - - Note: `user_id` field already existed from Migration 25 - -#### Frontend UI Changes: -- **Constants** (`constants.ts`): Added `user: '8rem'` column width -- **Header** (`QueueListHeader.tsx`): Added "User" column header -- **Item Component** (`QueueItemComponent.tsx`): - - Added logic to display user information (display_name → email → user_id) - - Added user column to queue item row - - Added tooltip with full username on hover - - Added "Hidden for privacy" message when field_values are null for non-owned items -- **Localization** (`en.json`): Added translations: - - `"user": "User"` - - `"fieldValuesHidden": "Hidden for privacy"` - -## Security Considerations - -### Token Verification -- Tokens are verified using the existing `verify_token()` function from `invokeai.app.services.auth.token_service` -- Invalid or missing tokens default to "system" user with non-admin privileges -- Socket connections without valid tokens are still accepted for backward compatibility but have limited access - -### Data Privacy -- Field values are only visible to: - - The user who created the queue item - - Admin users -- Non-admin users viewing other users' queue items see "Hidden for privacy" instead of field values - -### Admin 
Privileges -- Admin users can see all queue events and field values across all users -- Admin status is determined from the JWT token's `is_admin` field - -## Migration Notes - -No database migration is required. The changes leverage: -- Existing `user_id` column in `session_queue` table (added in Migration 25) -- Existing `users` table (added in Migration 25) -- SQL LEFT JOINs to fetch user information (gracefully handles missing user records) - -## Testing Requirements - -### Backend Testing -1. **Socket Authentication:** - - Verify valid tokens are accepted and user context is stored - - Verify invalid tokens default to system user - - Verify expired tokens are rejected - -2. **Event Filtering:** - - User A should only receive events for their own queue items - - Admin users should receive all events - - Non-admin users should not receive events from other users - -3. **Field Value Sanitization:** - - Non-admin users should see null field_values for other users' items - - Admins should see all field values - - Users should see their own field values - -### Frontend Testing -1. **UI Display:** - - User column should display in queue list - - Display name should be shown when available - - Email should be shown as fallback when display name is missing - - User ID should be shown when both display name and email are missing - - Tooltip should show full username on hover - -2. **Field Values Display:** - - "Hidden for privacy" message should appear when viewing other users' items - - Own items should show field values normally - -3. **Multi-Browser Testing:** - - Open queue tab in two browsers with different users - - Start generation in one browser - - Verify other browser doesn't see the preview/progress - - Verify admin user can see all generations - -### Integration Testing -1. Multi-user scenarios with simultaneous generations -2. Queue updates across multiple browser windows -3. Admin vs. non-admin privilege differentiation -4. 
Socket reconnection handling - -## Known Limitations - -1. **TypeScript Types:** - - The OpenAPI schema needs to be regenerated to include new fields - - Run: `cd invokeai/frontend/web && python ../../../scripts/generate_openapi_schema.py | pnpm typegen` - -2. **Backward Compatibility:** - - System user ("system") entries will not have display name or email - - Existing queue items from before Migration 25 will have user_id="system" - -3. **Socket.IO Session Storage:** - - Socket.IO's in-memory session storage may not persist across server restarts - - Consider implementing persistent session storage if needed for production - -## Future Enhancements - -1. Add user filtering to queue list (show only my items vs. all items) -2. Add permission system for queue management operations (cancel, retry, delete) -3. Implement queue item ownership transfer for administrative purposes -4. Add audit logging for queue operations with user attribution -5. Consider implementing user-specific queue limits or quotas diff --git a/docs/src/content/docs/features/gallery.mdx b/docs/src/content/docs/features/gallery.mdx index 1d490b33d76..6b3ad2e47cb 100644 --- a/docs/src/content/docs/features/gallery.mdx +++ b/docs/src/content/docs/features/gallery.mdx @@ -1,6 +1,6 @@ --- title: Gallery Panel -description: Learn how to manage, organize, and use your generated images and assets with the Gallery Panel in InvokeAI. +description: Learn how to manage, organize, and use your generated images, videos, and assets with the Gallery Panel in InvokeAI. lastUpdated: 2026-02-19 sidebar: order: 1 @@ -8,7 +8,7 @@ sidebar: import { Card, CardGrid, Steps } from '@astrojs/starlight/components'; -The Gallery Panel is a fast way to review, find, and make use of images you've generated and loaded. The Gallery is divided into **Boards**. The *Uncategorized* board is always present, but you can create your own for better organization. 
+The Gallery Panel is a fast way to review, find, and make use of images and videos you've generated and loaded. The Gallery is divided into **Boards**. The *Uncategorized* board is always present, but you can create your own for better organization. Boards are polymorphic — images and videos coexist on the same board and appear together in the gallery, sorted by creation time. ![Gallery Panel Overview](./assets/gallery.png) @@ -51,10 +51,10 @@ Each board has a context menu accessible via right-click (or Ctrl+click). - **Auto-add to this Board:** If *Auto-Assign Board on Click* is disabled in settings, use this option to quickly set the selected board as the default destination for new images. - **Download Board:** Packages all images within the board into a `.zip` file. A notification link will be provided when the download is ready. -- **Delete Board:** Permanently removes the board and all of its contents. +- **Delete Board:** Permanently removes the board and all of its contents — both images **and** videos. :::danger -Deleting a board will **permanently delete all images** contained within it. Proceed with caution! +Deleting a board will **permanently delete all images and videos** contained within it. Proceed with caution! ::: ### Board Contents @@ -130,6 +130,27 @@ Additionally, each image has a context menu (right-click or Ctrl+click) with pow --- +## Videos in the Gallery + +Videos generated by InvokeAI (currently from the Wan 2.2 model family) appear alongside images in the same gallery view. Each video item displays a first-frame still as its thumbnail with a play badge in the corner; selecting it opens the video in the viewer where you can play it back inline. + +### Uploading Videos + +You can upload existing videos to a board via the standard drop-or-upload affordance. The upload pipeline accepts **MP4 files only**. 
Other containers (`.mov`, `.webm`, `.mkv`) are not transcoded on upload and are rejected at the API boundary — re-encode them to MP4 (for example with `ffmpeg -i input.mov -c:v libx264 output.mp4`) before uploading.
+
+### Video Context Menu
+
+Each video has a context menu with the same organization actions as images, plus video-appropriate variants:
+
+- **Open in New Tab / Download:** Opens or saves the raw MP4 file.
+- **Star Video:** Pins the video to the top of the gallery.
+- **Change Board:** Moves the video to a different board. *(Drag-and-drop onto board thumbnails also works.)*
+- **Delete Video:** Permanently deletes the video and its thumbnail.
+
+Videos count toward board contents: a board with two images and three videos shows five items in the polymorphic gallery list and reports both totals in its stats.
+
+---
+
 ## Summary
 
 This walkthrough covers the Gallery interface and Boards. For guidance on prompting and generation workflows, please refer to the [Prompting Guide](/concepts/prompting-guide/) and [AI Image Generation](/concepts/image-generation/).
diff --git a/docs/src/content/docs/features/video-generation.mdx b/docs/src/content/docs/features/video-generation.mdx
new file mode 100644
index 00000000000..3e3d31104a3
--- /dev/null
+++ b/docs/src/content/docs/features/video-generation.mdx
@@ -0,0 +1,251 @@
+---
+title: Video Generation (experimental)
+description: Generate short videos with the Wan 2.2 model family — text-to-video, image-to-video, and the trick for stitching longer sequences.
+lastUpdated: 2026-05-13
+---
+
+import { Card, CardGrid, Steps } from '@astrojs/starlight/components';
+
+InvokeAI ships **experimental support for the Wan 2.2 model family**, which lets you generate short MP4 clips from a text prompt, an image, or both. Output ranges from a few-second loop (the model's training distribution) up to longer sequences assembled with the [concat trick](#making-longer-videos) below.
+
+:::caution[Experimental]
+Video generation is a prototype feature. Workflows, node fields, and starter-model packaging may change between releases. The underlying models are also new — expect rough edges in coherence and artifacts at longer durations.
+:::
+
+---
+
+## Models
+
+Wan 2.2 ships three transformer variants, plus a shared text encoder and two VAEs. All share the same diffusion-style sampling but differ in size, conditioning, and intended task.
+
+### The variants
+
+| Variant | Task | Params | VAE | Conditioning |
+|---|---|---|---|---|
+| **T2V-A14B** | Text → Video | 14B × 2 experts | A14B VAE (16-ch, 8× spatial) | Text only |
+| **I2V-A14B** | Image+Text → Video | 14B × 2 experts | A14B VAE (16-ch, 8× spatial) | Text + reference image (36-channel concat) |
+| **TI2V-5B** | Text → Video OR Image+Text → Video | 5B (single) | Wan 2.2-VAE (48-ch, 16× spatial) | Text, optionally with reference image (first-frame mask blend) |
+
+* **T2V-A14B** generates videos from a text prompt alone. Best motion coherence and prompt-following of the three.
+* **I2V-A14B** locks the first frame to a reference image you supply. Best subject-preservation; the image is concatenated to the noise latents at every step so the model "sees" the reference throughout denoising.
+* **TI2V-5B** is the small single-expert variant. It can do **both** text-to-video and image-to-video with the same checkpoint, at substantially lower VRAM, but with somewhat less stable long-range coherence than the A14B variants.
+
+### High-noise and low-noise transformers (A14B variants only)
+
+The A14B models are a **mixture-of-experts (MoE)** pair. There are actually *two* 14B transformers on disk per variant — a "high-noise" expert and a "low-noise" expert — and the denoise loop swaps between them at a model-defined boundary timestep:
+
+* **High-noise expert** runs early in denoising, when the latents are still mostly noise. It's responsible for composition, layout, and broad motion.
+* **Low-noise expert** runs later, when the latents are close to clean. It refines detail and texture.
+
+InvokeAI handles the swap automatically. Both experts have to be installed (the starter bundle handles this), but the workflow references only the "high-noise" model as the main model; the "low-noise" model is wired alongside it via the loader node. You don't manage the boundary yourself.
+
+**TI2V-5B is single-expert** — no swap, no boundary, just one model that runs every step. Workflows for TI2V-5B are correspondingly simpler.
+
+### Lightning LoRAs (4-step inference for A14B)
+
+The default A14B variants need ~40–50 denoise steps for clean output. **Lightning distillation LoRAs** (by lightx2v; see Acknowledgements) collapse that to 4 steps with minimal quality loss — about a 10× speedup. There's a pair per variant (one LoRA for the high-noise expert, one for the low-noise), wired through the LoRA loader nodes in the starter workflows.
+
+:::tip
+**TI2V-5B doesn't have a Lightning LoRA.** Its smaller size means each step is cheap; you typically run it at 40–50 steps and end up in a wall-clock ballpark similar to A14B + Lightning.
+:::
+
+### Installing models
+
+The model manager ships two **starter bundles** for video work:
+
+* **Wan 2.2 Text-to-Video** (~36 GB) — UMT5-XXL text encoder, both VAEs, TI2V-5B Q4_K_M, T2V-A14B Q4_K_M (high + low), T2V Lightning (high + low).
+* **Wan 2.2 Image-to-Video** (~32 GB) — UMT5-XXL, A14B VAE, I2V-A14B Q4_K_M (high + low), I2V Lightning (high + low).
+
+The bundles are independent. Installing both ends up at ~56 GB total (shared components — UMT5-XXL and the A14B VAE — are deduplicated on the second install). With a 12 GB VRAM card, you can install only the Text-to-Video bundle and still have **TI2V-5B available for both T2V and image-to-video** without ever touching the I2V bundle.
+
+Higher-quality Q8_0 quantizations of every transformer, plus full Diffusers builds of all three variants, are available as a-la-carte installs in the starter models list.
+
+:::tip
+**On a 12 GB VRAM card**: install just the Text-to-Video bundle and use TI2V-5B. The A14B variants will technically run via aggressive offloading but are slow and prone to OOM. TI2V-5B Q4_K_M fits comfortably and is what we recommend for that tier.
+:::
+
+---
+
+## Workflow setup
+
+The shipped starter workflows ("Text to Video - Wan 2.2 Lightning", "Image to Video - Wan 2.2 Lightning") are the easiest starting point — load them from the workflow library, pick your models, set a prompt, and Invoke. The sections below describe what's happening inside so you can build your own.
+
+### Constraints that apply to every video workflow
+
+**Frame count**: `num_frames - 1` must be divisible by **4**. This is dictated by the Wan VAE's temporal compression (4 pixel-frames → 1 latent-frame). Valid values: 5, 9, 13, … **81** (the training default, 5 seconds at 16 fps), 85, 89, etc.
+
+**Pixel dimensions**: must be a multiple of **16** for T2V-A14B and I2V-A14B, and a multiple of **32** for TI2V-5B. The constraint comes from the VAE's spatial downsample × the transformer's 2×2 patch size:
+
+| Variant | VAE spatial | Pixel multiple of |
+|---|---|---|
+| T2V-A14B, I2V-A14B | 8× | **16** |
+| TI2V-5B | 16× | **32** |
+
+Reference values that work: 832×480 (480p), 1280×720 (720p, A14B only — TI2V-5B needs 1280×704 instead since 720 isn't divisible by 32).
+
+**Encoder and denoise dimensions must match**: the `Reference Image - Wan 2.2` encoder and the `Denoise Video - Wan 2.2` node both have their own `width` and `height` fields. They have to be identical or the denoise loop will reject the condition tensor.
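The frame-count and pixel-dimension rules above can be checked before queueing a run. The following is a minimal sketch, not InvokeAI's own validation code: `PIXEL_MULTIPLE`, `is_valid_num_frames`, `is_valid_size`, and `snap_down` are hypothetical names introduced here for illustration, and only the numeric rules (divisible-by-4 frame rule, multiples of 16 or 32) are taken from the text.

```python
# Hypothetical helpers (not part of InvokeAI) encoding the constraints above.

# Required pixel multiple per variant: VAE spatial downsample times the 2x2 patch size.
PIXEL_MULTIPLE = {
    "T2V-A14B": 16,  # 8x VAE * 2
    "I2V-A14B": 16,  # 8x VAE * 2
    "TI2V-5B": 32,   # 16x VAE * 2
}


def is_valid_num_frames(num_frames: int) -> bool:
    """True when num_frames is positive and (num_frames - 1) is divisible by 4."""
    return num_frames > 0 and (num_frames - 1) % 4 == 0


def is_valid_size(variant: str, width: int, height: int) -> bool:
    """True when both dimensions are multiples of the variant's pixel multiple."""
    multiple = PIXEL_MULTIPLE[variant]
    return width % multiple == 0 and height % multiple == 0


def snap_down(value: int, multiple: int) -> int:
    """Round a dimension down to the nearest valid multiple."""
    return (value // multiple) * multiple
```

For example, `snap_down(720, 32)` gives 704, which matches the 1280×704 suggestion for TI2V-5B, while `is_valid_size("TI2V-5B", 1280, 720)` is false because 720 is not a multiple of 32.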
+
+:::tip[Use Wan 2.2 I2V Ideal Dimensions]
+The **Wan 2.2 I2V Ideal Dimensions** node takes a source image's W×H and a target preset (480p / 720p / 1080p) and outputs valid (width, height) for the encoder + denoise inputs. Wire it once and feed its outputs into both nodes. This saves the manual snap-to-16/snap-to-32 math.
+:::
+
+### Text-to-video workflow
+
+The minimum node chain for T2V:
+
+```
+Wan Main Model Loader ──┐
+                        │
+Wan T5 Text Encoder ────┤
+                        ▼
+Wan Compel Conditioning (positive)
+                        │
+                        ▼
+              Denoise Video - Wan 2.2 ──→ Latents to Video - Wan 2.2 ──→ MP4
+                                    ▲
+Wan Compel Conditioning (negative) ─┘
+```
+
+For **TI2V-5B T2V** this is the entire graph — load the TI2V-5B model and the TI2V-5B VAE, set width/height/num_frames, and run.
+
+For **T2V-A14B** the main model loader also exposes the low-noise expert slot, and you typically add the **Lightning LoRA pair** (one for each expert) to bring step count down to 4. Recommended:
+
+* Steps: **4** (with Lightning) or **40–50** (without)
+* CFG: **5.0** high-noise / **4.0** low-noise without Lightning (the low-noise value goes in the dedicated `Guidance Scale (Low Noise)` field on the denoise node); **1.0** for both with Lightning
+* Width × Height: 832×480 (faster, default) or 1280×720 (sharper, about 2.3× the pixels and correspondingly more memory)
+
+### Image-to-video workflow
+
+I2V adds a **Reference Image** branch alongside the denoise. The reference image gets VAE-encoded into a conditioning tensor that the denoise loop uses to anchor the video's content:
+
+```
+Wan Main Model Loader ──┐
+Wan T5 Text Encoder ────┤
+Wan Compel Conditioning ┤
+                        │
+Image Primitive ──→ Reference Image - Wan 2.2 ──┐
+                        │                       │
+                        ▼                       ▼
+                  Denoise Video - Wan 2.2 ──→ Latents to Video - Wan 2.2 ──→ MP4
+```
+
+For **I2V-A14B**, both the reference encoder and the denoise node need to use the same width/height. The encoder also takes a `num_frames` parameter that must match the denoise node's `num_frames` — set both to 81 by default.
+
+For **TI2V-5B image-to-video**, the conditioning math is different (the model uses a first-frame-mask blend rather than channel concatenation), but the workflow shape is the same. The encoder auto-detects TI2V-5B from the VAE's 48 latent channels and emits the right condition tensor.
+
+:::caution[TI2V-5B I2V dimensional constraint]
+TI2V-5B image-to-video requires **width and height divisible by 32** (not just 16). The encoder will refuse the workflow with a clear error if not. 832×480 works; 1280×720 does not (720 is not divisible by 32). Use 1280×704 for 720p-ish on TI2V-5B.
+:::
+
+### Recommended starting parameters
+
+| | T2V-A14B + Lightning | T2V-A14B | I2V-A14B + Lightning | TI2V-5B (T2V or I2V) |
+|---|---|---|---|---|
+| Steps | 4 | 40–50 | 4 | 40–50 |
+| CFG (high) | 1.0 | 5.0 | 1.0 | 5.0–5.5 |
+| CFG (low) | 1.0 | 4.0 | 1.0 | n/a (single expert) |
+| Num frames | 81 | 81 | 81 | 81 |
+| Width × Height | 832×480 | 832×480 | 832×480 | 832×480 |
+| Scheduler | Auto (FlowMatchEuler) | Auto | Auto | Auto (UniPC) |
+
+---
+
+## Making longer videos
+
+The Wan 2.2 models were trained on **81-frame** clips (5 seconds at 16 fps). Outputs much longer than that suffer rapidly degrading coherence — the temporal positional encoding goes out of distribution and the model loses track of scene content. So instead of asking for `num_frames=201`, the recommended pattern is **chaining**: render a sequence of 81-frame clips where each one's first frame matches the previous clip's last frame, then concatenate them with the `Concatenate Videos` node.
+
+### The basic chain
+
+<Steps>
+
+1. **Render the first clip** with I2V or T2V, ending on whatever subject/scene you want to continue.
+
+2. **Extract the last frame** of clip 1 using the `Frame from Video` node. Use `frame_index = -1` for the literal last frame, or `-3` / `-5` to step back a few frames (last frames sometimes have boundary artifacts — see the [troubleshooting](#late-frame-artifacts-text-color-blobs) note).
+
+3. **Feed that frame as the reference image** for an I2V run that becomes clip 2. Adjust the prompt for whatever motion you want next.
+
+4. **Repeat** for as many clips as you want.
+
+5. **Concatenate** all the clips into a single MP4 with `Concatenate Videos`. Pick a transition mode based on whether you want a seamless join (`cut` if the bridge frame matches perfectly), a smooth blend (`crossfade`), or a punctuated scene change (`fade_through_black`).
+
+</Steps>
+
+### Transition modes
+
+The `Concatenate Videos` node offers three:
+
+* **`cut`** — hard splice. Fastest. Total length = sum of inputs. Use this when the bridge frame is genuinely shared (clip 2's first frame = clip 1's last frame) — the seam is invisible.
+* **`crossfade`** — linear A→B dissolve over `transition_frames`. Consumes `transition_frames` from both sides of each boundary. Total length = `sum(inputs) - transition_frames × (n-1)`. Use this when bridge frames don't quite match.
+* **`fade_through_black`** — A fades to black, then B fades in from black. Total length is preserved. Use this for explicit scene changes.
+
+### Quality degradation across iterations
+
+A real failure mode of long chains: each iteration's reference image is itself a *generation output*, so artifacts compound. The model treats codec artifacts and VAE softness in the bridge frame as "style" and reproduces them in the next clip. By the 4th or 5th iteration you can see noticeable softening or color drift.
+
+**Mitigations**:
+
+1. **Pick a bridge frame a few back from the end** (e.g., `frame_index = -3` or `-5`). The very last frame is often the worst frame of a clip due to boundary effects in the temporal attention.
+2. **Refresh the bridge frame with a low-strength img2img pass** before feeding it into the next I2V. An SDXL or FLUX img2img at strength ~0.2 with a quality-focused negative prompt (`blurry, low quality, compression artifacts`) noticeably suppresses the cumulative drift.
+3. **Don't chain more than 4–5 clips** unless you're explicitly doing img2img refinement between each.
+
+---
+
+## Troubleshooting
+
+### OOM errors
+
+Video denoise is memory-intensive — attention scales roughly as `(T_lat × H/16 × W/16)²`, so resolution and frame count both quadratically affect peak VRAM.
+
+* **Drop resolution before frame count.** Going from 1280×720 to 832×480 is a ~2.4× memory drop and visually subtle in most content. Going from 81 frames to 65 only saves ~20%.
+* **TI2V-5B before A14B.** TI2V-5B Q4_K_M peaks at around 6–8 GB at 832×480, versus around 12–14 GB for A14B Q4_K_M. If you're at the OOM edge, switch model family.
+* **OOM at the *reference image encoder* step** is usually allocator fragmentation from a previous run rather than absolute memory pressure. Restart the dev server and try again; if it recurs reproducibly, file an issue.
+
+### Late-frame artifacts (text, color blobs)
+
+If your video looks great for most of its duration but the last ~20% develops Asian text, watermarks, or floating colored shapes, **that's the model's training-data prior leaking through** as temporal coherence weakens at long temporal distance. It's particularly common on TI2V-5B (smaller model, less capacity to hold scene).
+
+**Mitigations**:
+
+* Add to the negative prompt: `text, watermark, logo, subtitles, chinese characters, kanji, ticker, banner`
+* Use a more specific prompt — describe the action you want to *happen* through the clip, not just the static scene
+* Bump CFG to 5.5 (TI2V-5B tolerates this)
+* Stay at `num_frames=81`; values above push temporal RoPE out of distribution and artifacts accelerate
+
+### "Dimensions must be multiples of 16/32"
+
+The encoder and denoise nodes enforce these at runtime. Either:
+
+* Use the **Wan 2.2 I2V Ideal Dimensions** node to compute valid (W, H) automatically from a source image, or
+* Manually round to the right multiple (16 for A14B variants, 32 for TI2V-5B)
+
+### Reference image / denoise dimension mismatch
+
+If the denoise node fails with `Reference-image dimensions … must match denoise dimensions`, the cause is that both nodes have their own width/height fields and they need to agree. Wire the same values (or the same Ideal Dimensions output) into both.
+
+### TI2V-5B VAE state-dict load error
+
+If `Latents to Video - Wan 2.2` fails with `Error(s) in loading state_dict for AutoencoderKLWan: ... size mismatch for ...`, you have the **wrong VAE installed** for the chosen transformer. TI2V-5B needs the **Wan 2.2 TI2V-5B VAE** (48-channel, Wan 2.2-VAE), not the A14B VAE (16-channel). Both are in the model manager — check the VAE field on the loader and the latents-to-video node.
+
+### Sampler drift on the standalone TI2V-5B GGUF
+
+Standalone GGUF installs don't ship a `scheduler/` config directory. InvokeAI now defaults to `UniPCMultistepScheduler` with the correct Wan-flow params when the model is TI2V-5B and there's no on-disk scheduler — but if you have an older install behaving oddly, the safer alternative is the **full Diffusers TI2V-5B** install (which includes the scheduler config).
+
+### Preview images don't appear
+
+Two known causes:
+
+1. **A video is already loaded in the viewer.** If the last-selected gallery item is a video, the viewer renders the video element. The progress preview overlays on top of it. If you don't see it at all, **hard-refresh the browser** (`Ctrl+Shift+R` / `Cmd+Shift+R`) — Vite's bundle cache occasionally serves a stale build.
+2. **`Show progress in viewer` is disabled.** Check the gallery settings (gear icon at the top of the gallery panel).
+
+### Pipeline runs but the final MP4 is glitchy
+
+This is almost always a **VAE mismatch** or a **scheduler mismatch** — both surface as garbage at the very end of the pipeline. Check that:
+
+* The VAE matches the transformer family (16-ch for A14B, 48-ch for TI2V-5B)
+* You're using the default (auto-selected) scheduler — manually overriding it is currently not supported
+
+---
+
+## Acknowledgements
+
+Wan 2.2 model family by the Alibaba Wan-AI team. Lightning distillation LoRAs by [lightx2v](https://huggingface.co/lightx2v/Wan2.2-Lightning). GGUF quantizations by [QuantStack](https://huggingface.co/QuantStack).
diff --git a/docs/src/content/docs/start-here/system-requirements.mdx b/docs/src/content/docs/start-here/system-requirements.mdx
index 114698ce158..5eff2bc427a 100644
--- a/docs/src/content/docs/start-here/system-requirements.mdx
+++ b/docs/src/content/docs/start-here/system-requirements.mdx
@@ -2,7 +2,7 @@
 title: Hardware Requirements
 sidebar:
   order: 1
-lastUpdated: 2026-02-18
+lastUpdated: 2026-05-11
 ---
 
 import { Tabs, TabItem, Steps } from '@astrojs/starlight/components'
@@ -28,6 +28,8 @@ The requirements below are rough guidelines for best performance. GPUs with less
 | FLUX.2 Klein 4B | 1024x1024 | Nvidia 30xx+ | 12GB | 16GB | FP8 works with 8GB+; Diffusers + encoder |
 | FLUX.2 Klein 9B | 1024x1024 | Nvidia 40xx | 24GB | 32GB | FP8 works with 12GB+; Diffusers + encoder |
 | Z-Image Turbo | 1024x1024 | Nvidia 20xx+ | 8GB | 16GB | Q4_K 8GB; Q8/BF16 16GB+ |
+| Wan 2.2 A14B (T2V/I2V) | 1280x720 | Nvidia 30xx+ | 12GB | 32GB | Dual-expert MoE; Q4_K_M 12GB; Q8 18GB+; Diffusers requires 32GB+ |
+| Wan 2.2 TI2V-5B | 1280x720 | Nvidia 20xx+ | 8GB | 16GB | Single transformer; Q4_K_M 6GB+; Q8 8GB+; Diffusers 12GB+ |
 
 :::tip[`tmpfs` on Linux]
 If your temporary directory is mounted as a `tmpfs`, ensure it has sufficient space.
diff --git a/invokeai/app/api/dependencies.py b/invokeai/app/api/dependencies.py
index e7468c1bca4..9cee3cc637d 100644
--- a/invokeai/app/api/dependencies.py
+++ b/invokeai/app/api/dependencies.py
@@ -10,6 +10,7 @@
 from invokeai.app.services.board_image_records.board_image_records_sqlite import SqliteBoardImageRecordStorage
 from invokeai.app.services.board_images.board_images_default import BoardImagesService
 from invokeai.app.services.board_records.board_records_sqlite import SqliteBoardRecordStorage
+from invokeai.app.services.board_video_records.board_video_records_sqlite import SqliteBoardVideoRecordStorage
 from invokeai.app.services.boards.boards_default import BoardService
 from invokeai.app.services.bulk_download.bulk_download_default import BulkDownloadService
 from invokeai.app.services.client_state_persistence.client_state_persistence_sqlite import ClientStatePersistenceSqlite
@@ -24,6 +25,7 @@
     SeedreamProvider,
 )
 from invokeai.app.services.external_generation.startup import sync_configured_external_starter_models
+from invokeai.app.services.gallery.gallery_default import SqliteGalleryService
 from invokeai.app.services.image_files.image_files_disk import DiskImageFileStorage
 from invokeai.app.services.image_records.image_records_sqlite import SqliteImageRecordStorage
 from invokeai.app.services.images.images_default import ImageService
@@ -51,6 +53,9 @@
 from invokeai.app.services.style_preset_records.style_preset_records_sqlite import SqliteStylePresetRecordsStorage
 from invokeai.app.services.urls.urls_default import LocalUrlService
 from invokeai.app.services.users.users_default import UserService
+from invokeai.app.services.video_files.video_files_disk import DiskVideoFileStorage
+from invokeai.app.services.video_records.video_records_sqlite import SqliteVideoRecordStorage
+from invokeai.app.services.videos.videos_default import VideoService
 from invokeai.app.services.workflow_records.workflow_records_sqlite import SqliteWorkflowRecordsStorage
 from invokeai.app.services.workflow_thumbnails.workflow_thumbnails_disk import WorkflowThumbnailFileStorageDisk
 from invokeai.backend.stable_diffusion.diffusion.conditioning_data import (
@@ -62,6 +67,7 @@
     QwenImageConditioningInfo,
     SD3ConditioningInfo,
     SDXLConditioningInfo,
+    WanConditioningInfo,
     ZImageConditioningInfo,
 )
 from invokeai.backend.util.logging import InvokeAILogger
@@ -107,6 +113,7 @@ def initialize(
             raise ValueError("Output folder is not set")
 
         image_files = DiskImageFileStorage(f"{output_folder}/images")
+        video_files = DiskVideoFileStorage(f"{output_folder}/videos")
 
         model_images_folder = config.models_path
         style_presets_folder = config.style_presets_path
@@ -131,6 +138,10 @@ def initialize(
         bulk_download = BulkDownloadService()
         image_records = SqliteImageRecordStorage(db=db)
         images = ImageService()
+        video_records = SqliteVideoRecordStorage(db=db)
+        videos = VideoService()
+        board_video_records = SqliteBoardVideoRecordStorage(db=db)
+        gallery = SqliteGalleryService(db=db)
         invocation_cache = MemoryInvocationCache(max_cache_size=config.node_cache_size)
         tensors = ObjectSerializerForwardCache(
             ObjectSerializerDisk[torch.Tensor](
@@ -152,6 +163,7 @@ def initialize(
                     ZImageConditioningInfo,
                     QwenImageConditioningInfo,
                     AnimaConditioningInfo,
+                    WanConditioningInfo,
                 ],
                 ephemeral=True,
             ),
@@ -221,6 +233,11 @@ def initialize(
             workflow_thumbnails=workflow_thumbnails,
             client_state_persistence=client_state_persistence,
             users=users,
+            videos=videos,
+            video_files=video_files,
+            video_records=video_records,
+            board_video_records=board_video_records,
+            gallery=gallery,
         )
 
         ApiDependencies.invoker = Invoker(services)
diff --git a/invokeai/app/api/routers/boards.py b/invokeai/app/api/routers/boards.py
index 6897e90aff4..adba3cb672b 100644
--- a/invokeai/app/api/routers/boards.py
+++ b/invokeai/app/api/routers/boards.py
@@ -21,6 +21,14 @@ class DeleteBoardResult(BaseModel):
         description="The image names of the board-images relationships that were deleted."
     )
     deleted_images: list[str] = Field(description="The names of the images that were deleted.")
+    deleted_board_videos: list[str] = Field(
+        default_factory=list,
+        description="The video names of the board-videos relationships that were deleted.",
+    )
+    deleted_videos: list[str] = Field(
+        default_factory=list,
+        description="The names of the videos that were deleted.",
+    )
 
 
 @boards_router.post(
@@ -116,19 +124,34 @@ async def delete_board(
     if not current_user.is_admin and board.user_id != current_user.user_id:
         raise HTTPException(status_code=403, detail="Not authorized to delete this board")
 
+    # Admins delete everything on the board; regular owners only delete their own
+    # contributions so that contributions from other users to a public/shared board
+    # are preserved (they cascade to "uncategorized" via FK on board_videos / board_images).
+    cascade_user_id: Optional[str] = None if current_user.is_admin else current_user.user_id
+
     try:
         if include_images is True:
             deleted_images = ApiDependencies.invoker.services.board_images.get_all_board_image_names_for_board(
                 board_id=board_id,
                 categories=None,
                 is_intermediate=None,
+                user_id=cascade_user_id,
+            )
+            deleted_videos = ApiDependencies.invoker.services.board_video_records.get_all_board_video_names_for_board(
+                board_id=board_id,
+                categories=None,
+                is_intermediate=None,
+                user_id=cascade_user_id,
             )
-            ApiDependencies.invoker.services.images.delete_images_on_board(board_id=board_id)
+            ApiDependencies.invoker.services.images.delete_images_on_board(board_id=board_id, user_id=cascade_user_id)
+            ApiDependencies.invoker.services.videos.delete_videos_on_board(board_id=board_id, user_id=cascade_user_id)
             ApiDependencies.invoker.services.boards.delete(board_id=board_id)
             return DeleteBoardResult(
                 board_id=board_id,
                 deleted_board_images=[],
                 deleted_images=deleted_images,
+                deleted_board_videos=[],
+                deleted_videos=deleted_videos,
             )
         else:
             deleted_board_images = ApiDependencies.invoker.services.board_images.get_all_board_image_names_for_board(
@@ -136,11 +159,20 @@ async def delete_board(
                 categories=None,
                 is_intermediate=None,
             )
+            deleted_board_videos = (
+                ApiDependencies.invoker.services.board_video_records.get_all_board_video_names_for_board(
+                    board_id=board_id,
+                    categories=None,
+                    is_intermediate=None,
+                )
+            )
             ApiDependencies.invoker.services.boards.delete(board_id=board_id)
             return DeleteBoardResult(
                 board_id=board_id,
                 deleted_board_images=deleted_board_images,
                 deleted_images=[],
+                deleted_board_videos=deleted_board_videos,
+                deleted_videos=[],
             )
     except Exception:
         raise HTTPException(status_code=500, detail="Failed to delete board")
diff --git a/invokeai/app/api/routers/gallery.py b/invokeai/app/api/routers/gallery.py
new file mode 100644
index 00000000000..4ffa5238802
--- /dev/null
+++ b/invokeai/app/api/routers/gallery.py
@@ -0,0 +1,97 @@
+from typing import Optional
+
+from fastapi import HTTPException, Query
+from fastapi.routing import APIRouter
+
+from invokeai.app.api.auth_dependencies import CurrentUserOrDefault
+from invokeai.app.api.dependencies import ApiDependencies
+from invokeai.app.api.routers.images import _assert_board_read_access
+from invokeai.app.services.gallery.gallery_common import GalleryItem, GalleryItemNamesResult
+from invokeai.app.services.image_records.image_records_common import ImageCategory, ResourceOrigin
+from invokeai.app.services.shared.pagination import OffsetPaginatedResults
+from invokeai.app.services.shared.sqlite.sqlite_common import SQLiteDirection
+
+gallery_router = APIRouter(prefix="/v1/gallery", tags=["gallery"])
+
+
+@gallery_router.get(
+    "/items/",
+    operation_id="list_gallery_items",
+    response_model=OffsetPaginatedResults[GalleryItem],
+)
+async def list_gallery_items(
+    current_user: CurrentUserOrDefault,
+    origin: Optional[ResourceOrigin] = Query(default=None, description="The origin of items to list."),
+    categories: Optional[list[ImageCategory]] = Query(
+        default=None,
+        description="The categories to include. Shared between images and videos.",
+    ),
+    is_intermediate: Optional[bool] = Query(default=None, description="Whether to list intermediate items."),
+    board_id: Optional[str] = Query(
+        default=None,
+        description="The board id to filter by. Use 'none' to find items without a board.",
+    ),
+    offset: int = Query(default=0, description="The page offset"),
+    limit: int = Query(default=10, description="The number of items per page"),
+    order_dir: SQLiteDirection = Query(default=SQLiteDirection.Descending, description="The order of sort"),
+    starred_first: bool = Query(default=True, description="Whether to sort by starred items first"),
+    search_term: Optional[str] = Query(default=None, description="The term to search for"),
+) -> OffsetPaginatedResults[GalleryItem]:
+    """Returns a paginated, time-sorted stream of polymorphic gallery items (images + videos)."""
+    if board_id is not None and board_id != "none":
+        _assert_board_read_access(board_id, current_user)
+
+    return ApiDependencies.invoker.services.gallery.list_items(
+        offset=offset,
+        limit=limit,
+        starred_first=starred_first,
+        order_dir=order_dir,
+        origin=origin,
+        categories=categories,
+        is_intermediate=is_intermediate,
+        board_id=board_id,
+        search_term=search_term,
+        user_id=current_user.user_id,
+        is_admin=current_user.is_admin,
+    )
+
+
+@gallery_router.get(
+    "/items/names",
+    operation_id="get_gallery_item_names",
+    response_model=GalleryItemNamesResult,
+)
+async def get_gallery_item_names(
+    current_user: CurrentUserOrDefault,
+    origin: Optional[ResourceOrigin] = Query(default=None, description="The origin of items to list."),
+    categories: Optional[list[ImageCategory]] = Query(
+        default=None,
+        description="The categories to include. Shared between images and videos.",
+    ),
+    is_intermediate: Optional[bool] = Query(default=None, description="Whether to list intermediate items."),
+    board_id: Optional[str] = Query(
+        default=None,
+        description="The board id to filter by. Use 'none' to find items without a board.",
+    ),
+    order_dir: SQLiteDirection = Query(default=SQLiteDirection.Descending, description="The order of sort"),
+    starred_first: bool = Query(default=True, description="Whether to sort by starred items first"),
+    search_term: Optional[str] = Query(default=None, description="The term to search for"),
+) -> GalleryItemNamesResult:
+    """Returns an ordered (kind, name) list — used to drive virtualized gallery selection."""
+    if board_id is not None and board_id != "none":
+        _assert_board_read_access(board_id, current_user)
+
+    try:
+        return ApiDependencies.invoker.services.gallery.list_item_names(
+            starred_first=starred_first,
+            order_dir=order_dir,
+            origin=origin,
+            categories=categories,
+            is_intermediate=is_intermediate,
+            board_id=board_id,
+            search_term=search_term,
+            user_id=current_user.user_id,
+            is_admin=current_user.is_admin,
+        )
+    except Exception:
+        raise HTTPException(status_code=500, detail="Failed to get gallery item names")
diff --git a/invokeai/app/api/routers/images.py b/invokeai/app/api/routers/images.py
index a3ae6fce82b..6f763b20230 100644
--- a/invokeai/app/api/routers/images.py
+++ b/invokeai/app/api/routers/images.py
@@ -551,6 +551,10 @@ async def delete_images_from_list(
     image_names: list[str] = Body(description="The list of names of images to delete", embed=True),
 ) -> DeleteImagesResult:
     try:
+        # Skip — but do not re-raise — auth failures so a foreign name mid-batch doesn't
+        # discard the response payload for items the caller had already legitimately deleted.
+        # Without this, the client cache never learns about the partial successes and the
+        # already-deleted records reappear in the UI until the next full refresh.
         deleted_images: set[str] = set()
         affected_boards: set[str] = set()
         for image_name in image_names:
@@ -562,7 +566,7 @@ async def delete_images_from_list(
                 deleted_images.add(image_name)
                 affected_boards.add(board_id)
             except HTTPException:
-                raise
+                continue
             except Exception:
                 pass
         return DeleteImagesResult(
diff --git a/invokeai/app/api/routers/videos.py b/invokeai/app/api/routers/videos.py
new file mode 100644
index 00000000000..1206f5647f3
--- /dev/null
+++ b/invokeai/app/api/routers/videos.py
@@ -0,0 +1,677 @@
+import re
+import tempfile
+import traceback
+from pathlib import Path
+from typing import Optional
+
+from fastapi import Body, HTTPException, Query, Request, Response, UploadFile
+from fastapi import Path as PathParam
+from fastapi.responses import FileResponse
+from fastapi.routing import APIRouter
+from pydantic import BaseModel, Field
+
+from invokeai.app.api.auth_dependencies import CurrentUserOrDefault
+from invokeai.app.api.dependencies import ApiDependencies
+from invokeai.app.api.routers.images import _assert_board_read_access
+from invokeai.app.invocations.fields import MetadataField
+from invokeai.app.services.image_records.image_records_common import ImageCategory, ResourceOrigin
+from invokeai.app.services.shared.pagination import OffsetPaginatedResults
+from invokeai.app.services.shared.sqlite.sqlite_common import SQLiteDirection
+from invokeai.app.services.video_records.video_records_common import VideoNamesResult, VideoRecordChanges
+from invokeai.app.services.videos.videos_common import (
+    AddVideosToBoardResult,
+    DeleteVideosResult,
+    RemoveVideosFromBoardResult,
+    StarredVideosResult,
+    UnstarredVideosResult,
+    VideoDTO,
+    VideoUrlsDTO,
+)
+from invokeai.app.util.video_thumbnails import probe_video
+
+videos_router = APIRouter(prefix="/v1/videos", tags=["videos"])
+
+# Videos are immutable; set a high max-age (1 year)
+VIDEO_MAX_AGE = 31536000
+
+# MP4 only — the names service emits `{uuid}.mp4` unconditionally and we don't transcode on
+# upload. Accepting .mov/.webm/.mkv here previously caused those containers to be stored
+# under a .mp4 name and served with the .mp4 MIME type, which silently broke playback in
+# browsers when the container did not match.
+ACCEPTED_VIDEO_MIME_PREFIXES = ("video/mp4",)
+ACCEPTED_VIDEO_EXTENSIONS = (".mp4",)
+
+# Per-chunk size for HTTP Range responses (1 MB)
+RANGE_CHUNK_SIZE = 1024 * 1024
+
+# Upload streaming chunk size (1 MB) and a coarse per-upload size cap. The cap is generous
+# because Wan-generated MP4s for long sequences can run into the hundreds of megabytes;
+# the goal is to prevent a single client from exhausting RAM, not to be a content policy.
+UPLOAD_CHUNK_SIZE = 1024 * 1024
+MAX_UPLOAD_SIZE = 1024 * 1024 * 1024  # 1 GB
+
+
+def _assert_video_owner(video_name: str, current_user: CurrentUserOrDefault) -> None:
+    """Raise 403 if the current user does not own the video and is not an admin."""
+    from invokeai.app.services.board_records.board_records_common import BoardVisibility
+
+    if current_user.is_admin:
+        return
+    owner = ApiDependencies.invoker.services.video_records.get_user_id(video_name)
+    if owner is not None and owner == current_user.user_id:
+        return
+
+    board_id = ApiDependencies.invoker.services.board_video_records.get_board_for_video(video_name)
+    if board_id is not None:
+        try:
+            board = ApiDependencies.invoker.services.boards.get_dto(board_id=board_id)
+            if board.user_id == current_user.user_id:
+                return
+            if board.board_visibility == BoardVisibility.Public:
+                return
+        except Exception:
+            pass
+
+    raise HTTPException(status_code=403, detail="Not authorized to modify this video")
+
+
+def _assert_video_direct_owner(video_name: str, current_user: CurrentUserOrDefault) -> None:
+    """Raise 403 if the current user is not the direct owner of the video.
+
+    Intentionally stricter than _assert_video_owner: board-ownership and public-board
+    fallbacks are NOT honored. Mirrors _assert_image_direct_owner in board_images.py —
+    board-move operations need to verify the *original* owner, otherwise a user could
+    move someone else's video onto their own board via the board-owner branch.
+    """
+    if current_user.is_admin:
+        return
+    owner = ApiDependencies.invoker.services.video_records.get_user_id(video_name)
+    if owner is not None and owner == current_user.user_id:
+        return
+    raise HTTPException(status_code=403, detail="Not authorized to move this video")
+
+
+def _assert_board_write_access(board_id: str, current_user: CurrentUserOrDefault) -> None:
+    """Raise 403 if the current user may not mutate the given board.
+
+    Mirrors _assert_board_write_access in board_images.py: admins and the board owner
+    may write; public boards accept contributions from any user.
+    """
+    from invokeai.app.services.board_records.board_records_common import BoardVisibility
+
+    try:
+        board = ApiDependencies.invoker.services.boards.get_dto(board_id=board_id)
+    except Exception:
+        raise HTTPException(status_code=404, detail="Board not found")
+    if current_user.is_admin:
+        return
+    if board.user_id == current_user.user_id:
+        return
+    if board.board_visibility == BoardVisibility.Public:
+        return
+    raise HTTPException(status_code=403, detail="Not authorized to modify this board")
+
+
+def _assert_video_read_access(video_name: str, current_user: CurrentUserOrDefault) -> None:
+    """Raise 403 if the current user may not view the video."""
+    from invokeai.app.services.board_records.board_records_common import BoardVisibility
+
+    if current_user.is_admin:
+        return
+    owner = ApiDependencies.invoker.services.video_records.get_user_id(video_name)
+    if owner is not None and owner == current_user.user_id:
+        return
+
+    board_id = ApiDependencies.invoker.services.board_video_records.get_board_for_video(video_name)
+    if board_id is not None:
+        try:
+            board = ApiDependencies.invoker.services.boards.get_dto(board_id=board_id)
+            if board.board_visibility in (BoardVisibility.Shared, BoardVisibility.Public):
+                return
+        except Exception:
+            pass
+
+    raise HTTPException(status_code=403, detail="Not authorized to access this video")
+
+
+def _is_accepted_video_upload(file: UploadFile) -> bool:
+    if file.content_type and file.content_type.startswith(ACCEPTED_VIDEO_MIME_PREFIXES):
+        return True
+    if file.filename:
+        return file.filename.lower().endswith(ACCEPTED_VIDEO_EXTENSIONS)
+    return False
+
+
+@videos_router.post(
+    "/upload",
+    operation_id="upload_video",
+    responses={
+        201: {"description": "The video was uploaded successfully"},
+        415: {"description": "Video upload failed"},
+    },
+    status_code=201,
+    response_model=VideoDTO,
+)
+async def upload_video(
+    current_user: CurrentUserOrDefault,
+    file: UploadFile,
+    request: Request,
+    response: Response,
+    video_category: ImageCategory = Query(description="The category of the video"),
+    is_intermediate: bool = Query(description="Whether this is an intermediate video"),
+    board_id: Optional[str] = Query(default=None, description="The board to add this video to, if any"),
+    session_id: Optional[str] = Query(default=None, description="The session ID associated with this upload, if any"),
+    metadata: Optional[str] = Body(
+        default=None,
+        description="The metadata to associate with the video, must be a stringified JSON dict",
+        embed=True,
+    ),
+) -> VideoDTO:
+    """Uploads a video for the current user."""
+    # Check board access for uploads to a specific board.
+    if board_id is not None:
+        from invokeai.app.services.board_records.board_records_common import BoardVisibility
+
+        try:
+            board = ApiDependencies.invoker.services.boards.get_dto(board_id=board_id)
+        except Exception:
+            raise HTTPException(status_code=404, detail="Board not found")
+        if (
+            not current_user.is_admin
+            and board.user_id != current_user.user_id
+            and board.board_visibility != BoardVisibility.Public
+        ):
+            raise HTTPException(status_code=403, detail="Not authorized to upload to this board")
+
+    if not _is_accepted_video_upload(file):
+        raise HTTPException(status_code=415, detail="Not a supported video file")
+
+    # Stream the upload to a tmp file so we can probe and then hand its path to the service.
+    # Reading the full body into memory first risked exhausting RAM on multi-GB uploads;
+    # chunk-stream instead and enforce a hard size cap.
+    tmp = tempfile.NamedTemporaryFile(prefix="invokeai_upload_", suffix=".mp4", delete=False)
+    tmp_path = Path(tmp.name)
+    try:
+        total = 0
+        while chunk := await file.read(UPLOAD_CHUNK_SIZE):
+            total += len(chunk)
+            if total > MAX_UPLOAD_SIZE:
+                tmp.close()
+                raise HTTPException(
+                    status_code=413,
+                    detail=f"Video upload exceeds maximum size ({MAX_UPLOAD_SIZE} bytes)",
+                )
+            tmp.write(chunk)
+        tmp.close()
+
+        try:
+            width, height, duration, fps = probe_video(tmp_path)
+        except Exception:
+            ApiDependencies.invoker.services.logger.error(traceback.format_exc())
+            raise HTTPException(status_code=415, detail="Failed to read video")
+
+        try:
+            video_dto = ApiDependencies.invoker.services.videos.create(
+                source_path=tmp_path,
+                width=width,
+                height=height,
+                duration=duration,
+                fps=fps,
+                video_origin=ResourceOrigin.EXTERNAL,
+                video_category=video_category,
+                session_id=session_id,
+                board_id=board_id,
+                metadata=metadata,
+                workflow=None,
+                graph=None,
+                is_intermediate=is_intermediate,
+                user_id=current_user.user_id,
+            )
+
+            response.status_code = 201
+            response.headers["Location"] = video_dto.video_url
+            return video_dto
+        except Exception:
+            ApiDependencies.invoker.services.logger.error(traceback.format_exc())
+            raise HTTPException(status_code=500, detail="Failed to create video")
+    finally:
+        # If create() succeeded the file was moved; this unlink is a no-op then.
+        try:
+            tmp_path.unlink(missing_ok=True)
+        except Exception:
+            pass
+
+
+@videos_router.delete("/i/{video_name}", operation_id="delete_video", response_model=DeleteVideosResult)
+async def delete_video(
+    current_user: CurrentUserOrDefault,
+    video_name: str = PathParam(description="The name of the video to delete"),
+) -> DeleteVideosResult:
+    _assert_video_owner(video_name, current_user)
+
+    # Let service-level failures surface as 500s rather than swallowing them and returning a
+    # success-shaped response. A previous version of this handler caught everything and
+    # returned an empty ``deleted_videos`` list with HTTP 200; the frontend treated that as
+    # success, dropped the item from its cache, and the video stayed on disk — a silent
+    # data-consistency failure that only became visible on the next page reload.
+    try:
+        video_dto = ApiDependencies.invoker.services.videos.get_dto(video_name)
+    except Exception:
+        raise HTTPException(status_code=404, detail="Video not found")
+
+    board_id = video_dto.board_id or "none"
+    try:
+        ApiDependencies.invoker.services.videos.delete(video_name)
+    except Exception:
+        raise HTTPException(status_code=500, detail="Failed to delete video")
+
+    return DeleteVideosResult(
+        deleted_videos=[video_name],
+        affected_boards=[board_id],
+    )
+
+
+@videos_router.post("/delete", operation_id="delete_videos_from_list", response_model=DeleteVideosResult)
+async def delete_videos_from_list(
+    current_user: CurrentUserOrDefault,
+    video_names: list[str] = Body(description="The list of names of videos to delete", embed=True),
+) -> DeleteVideosResult:
+    # Skip — but do not re-raise — auth failures so a foreign name mid-batch doesn't
+    # discard the response payload for items the caller had already legitimately deleted.
+    # Without this, the client cache never learns about the partial successes and the
+    # already-deleted records reappear in the UI until the next full refresh.
+    deleted_videos: set[str] = set()
+    affected_boards: set[str] = set()
+    for video_name in video_names:
+        try:
+            _assert_video_owner(video_name, current_user)
+            video_dto = ApiDependencies.invoker.services.videos.get_dto(video_name)
+            board_id = video_dto.board_id or "none"
+            ApiDependencies.invoker.services.videos.delete(video_name)
+            deleted_videos.add(video_name)
+            affected_boards.add(board_id)
+        except HTTPException:
+            continue
+        except Exception:
+            pass
+    return DeleteVideosResult(
+        deleted_videos=list(deleted_videos),
+        affected_boards=list(affected_boards),
+    )
+
+
+@videos_router.patch("/i/{video_name}", operation_id="update_video", response_model=VideoDTO)
+async def update_video(
+    current_user: CurrentUserOrDefault,
+    video_name: str = PathParam(description="The name of the video to update"),
+    video_changes: VideoRecordChanges = Body(description="The changes to apply to the video"),
+) -> VideoDTO:
+    _assert_video_owner(video_name, current_user)
+    try:
+        return ApiDependencies.invoker.services.videos.update(video_name, video_changes)
+    except Exception:
+        raise HTTPException(status_code=400, detail="Failed to update video")
+
+
+@videos_router.get("/i/{video_name}", operation_id="get_video_dto", response_model=VideoDTO)
+async def get_video_dto(
+    current_user: CurrentUserOrDefault,
+    video_name: str = PathParam(description="The name of video to get"),
+) -> VideoDTO:
+    _assert_video_read_access(video_name, current_user)
+    try:
+        return ApiDependencies.invoker.services.videos.get_dto(video_name)
+    except Exception:
+        raise HTTPException(status_code=404)
+
+
+@videos_router.get(
+    "/i/{video_name}/metadata", operation_id="get_video_metadata", response_model=Optional[MetadataField]
+)
+async def get_video_metadata(
+    current_user: CurrentUserOrDefault,
+    video_name: str = PathParam(description="The name of video to get"),
+) -> Optional[MetadataField]:
+    _assert_video_read_access(video_name, current_user)
+    try:
+        return ApiDependencies.invoker.services.videos.get_metadata(video_name)
+    except Exception:
+        raise HTTPException(status_code=404)
+
+
+def _parse_range_header(range_header: str, file_size: int) -> Optional[tuple[int, int]]:
+    """Parses an HTTP Range header of the form `bytes=START-END`. Returns inclusive (start, end)
+    byte offsets, or None if the header is malformed or unsatisfiable."""
+    match = re.match(r"^bytes=(\d*)-(\d*)$", range_header.strip())
+    if match is None:
+        return None
+    start_str, end_str = match.group(1), match.group(2)
+    if start_str == "" and end_str == "":
+        return None
+    if start_str == "":
+        # suffix range: last N bytes
+        try:
+            suffix_len = int(end_str)
+        except ValueError:
+            return None
+        if suffix_len == 0:
+            return None
+        start = max(file_size - suffix_len, 0)
+        end = file_size - 1
+    else:
+        try:
+            start = int(start_str)
+        except ValueError:
+            return None
+        if end_str == "":
+            end = file_size - 1
+        else:
+            try:
+                end = int(end_str)
+            except ValueError:
+                return None
+    if start > end or start >= file_size:
+        return None
+    end = min(end, file_size - 1)
+    return start, end
+
+
+@videos_router.get(
+    "/i/{video_name}/full",
+    operation_id="get_video_full",
+    response_class=Response,
+    responses={
+        200: {"description": "Return the full video file", "content": {"video/mp4": {}}},
+        206: {"description": "Return a byte-range of the video file", "content": {"video/mp4": {}}},
+        404: {"description": "Video not found"},
+    },
+)
+@videos_router.head(
+    "/i/{video_name}/full",
+    operation_id="get_video_full_head",
+    response_class=Response,
+    responses={
+        200: {"description": "Return the full video file", "content": {"video/mp4": {}}},
+        404: {"description": "Video not found"},
+    },
+)
+async def get_video_full(
+    request: Request,
+    video_name: str = PathParam(description="The name of video file to get"),
+) -> Response:
+    """Serves the video file with HTTP Range support so HTML5