Stensyl Adds Veo 3.1 and Kling 3.0: What Video Creators Need to Know

By Adam Morgan | 30 June 2026 | 11 min read

Veo 3.1 and Kling 3.0 are now live on Stensyl. Here's what each model does best and how to pick between them.

```html

What Veo 3.1 and Kling 3.0 Actually Add

Stensyl has added two new video generation engines to the platform: Google Veo 3.1 and Kling 3.0. Both are available immediately across all plans, including Free, with credits as the only differentiator in how much you can generate.

You'll find both models in three places. On the Video surface (/generate/video), they appear as selectable engines for direct text-to-video and image-to-video generation. Inside the Film studio, each shot in your timeline carries its own model selector, so you can mix engines across a sequence. And in Canvas, both are available as Video nodes within node-based workflows, letting you chain video generation with other generation types or compare outputs before committing credits to a full sequence.

The two models are not interchangeable. They solve different problems. Veo 3.1 is Google DeepMind's current text-to-video flagship, built around cinematic quality, native audio generation and narrative control across longer sequences. Kling 3.0, developed by Kuaishou, is positioned around physical accuracy, object coherence and surface material fidelity. Choosing the wrong one for a given brief wastes credits and time. This article covers practical use cases for each, across the twelve disciplines Stensyl serves.

What this article is not: a technical breakdown of model architecture or a comparison of inference infrastructure. The goal is to help you decide which engine to reach for, and when.

Both Veo 3.1 and Kling 3.0 are available on every Stensyl plan, from Free upwards. Credits are the only gating factor.

Veo 3.1: Where It Performs Best

Veo 3.1 launched in October 2025 and received further updates in January 2026, adding improved vertical video support and reference-image guidance. Google positions it as a state-of-the-art text-to-video, image-to-video, and text-to-audio-plus-video model, with measurable improvements over Veo 3 in prompt adherence, realism and physics simulation.

The disciplines that benefit most

Motion designers building title sequences, logo stings and typographic animations will get the most immediate value. Veo 3.1 handles cinematic camera moves, realistic lighting and synchronised audio in a single generation pass. A logo reveal that previously required separate video and audio workflows can be generated together from a single prompt.

Film and set designers using pre-visualisation workflows will find the model's scene extension capability particularly useful. Google describes continuous shots extending beyond a minute, built from chained segments, which maps directly to blocking previews and mood pieces where spatial and temporal continuity matter.

Content creators producing social video have native 9:16 output, improved upscaling to 1080p and 4K, and reference-image guidance that maintains character and scene consistency across a series. That last feature is significant for Reels, Shorts or TikTok content where recognisable visual identity matters from clip to clip.

Shot types and prompt styles where Veo 3.1 excels

Cinematic camera movement: dolly shots, tracking moves, crane shots with realistic parallax and physics-driven object dynamics.
Lighting consistency across cuts: when used with reference images and scene extension, Veo 3.1 preserves mood-driven lighting across multiple shots in a sequence.
Longer coherent sequences: chained segments that maintain visual continuity across what Google describes as minute-plus continuous output.
Audio-driven prompts: script-level guidance that generates synchronised dialogue, ambience and sound effects alongside the video output.

Concrete workflow: motion designer in Film studio

A motion designer building a five-scene title sequence opens the Film studio and lays out their shots: intro logo reveal, typography pass, hero shot, call-to-action, and outro. For each shot where camera movement and audio are central, they select Veo 3.1 as the render engine.

Using first/last-frame control, they bridge logo freeze-frames into full motion. Scene extension carries the soundtrack coherently across the sequence without requiring a separate audio pass. The result is a complete sequence with camera work, lighting and music generated in a single Film studio session.

When to use Kling 3.0 instead

Veo 3.1 is optimised for cinematic, narrative-driven and expressive character work. It is not the engine to reach for when you need precision product visualisation, strict surface material fidelity or shots where an object's silhouette must hold exactly across every frame. Google's documentation for Veo 3.1 frames its consistency improvements around narrative and character work, not industrial-grade product shots.

There is also a generation time consideration. Veo 3.1 runs approximately 8 to 12 per cent slower than Veo 3 when audio is enabled. Expect roughly 90 to 120 seconds for an 8-second clip without audio, and 150 to 180 seconds with audio on standard infrastructure. For rapid concept iteration and drafts, Luma Ray 2 Flash is the faster choice.

Veo 3.1's audio-plus-video generation is its most distinctive capability. If your brief doesn't require synchronised sound, consider whether a faster engine covers the shot adequately.

Kling 3.0: Where It Performs Best

Kling 3.0 is Kuaishou's high-fidelity text-to-video model. Public technical documentation is thinner than Veo's, and independent third-party benchmarks remain limited at the time of writing. What exists in vendor materials and community reporting consistently points in one direction: physical correctness, object coherence and surface material rendering.

The disciplines that benefit most

Product designers rendering hero shots, packshots in motion, or launch teasers need edge definition and material fidelity that holds across every frame. A water bottle with a matte-finish label, a perfume bottle with a glass stopper, a wireless speaker with a woven grille: these surfaces need to behave correctly in motion, not morph or blur across frames. Kling 3.0 is the current default on Stensyl for this category of shot.

Automotive designers exploring colourway reveals and paint finish options need reflections to stay stable across a dolly or orbit shot. Body lines that shift or distort mid-motion make a colourway video unusable as a client-facing deliverable. Vendor and demo material for Kling 3.0 consistently highlights this as a target use case.

Exhibition and spatial designers previewing trade fair booths, retail environments or museum installations need camera paths that maintain spatial continuity. A walkthrough that stutters or loses depth coherence reads as broken to a client reviewing pre-production. Kling 3.0's reported strength in rigid-body motion and camera path accuracy makes it well-suited to these spatial previews.

Where Kling 3.0 differentiates

Physical accuracy: believable rigid-body motion and camera paths through environments.
Object coherence: minimal morphing or shape drift for products across the full shot duration.
Surface material rendering: glossy metals, glass, plastics, textiles and architectural finishes rendered consistently in motion.

These claims are primarily vendor-driven and demo-verified at present. No broadly published third-party lab results exist for Kling 3.0's resolution limits or per-second performance specifications. Credit costs per clip are likewise not confirmed by any external source, so refer to Stensyl's current pricing page for definitive figures.

Concrete workflow: product designer using Boards and the Video surface

A product designer working on a new packaging reveal opens Boards and sets up a first frame: a straight-on hero shot of the product. They set a last frame: an angled close-up of the base and branding mark. Boards supports first/last-frame sequences that feed directly into video generation.

They pass that sequence to Kling 3.0 via the Video surface. The resulting clip transitions smoothly between both angles while preserving logo legibility, material finishes and edge definition. They take the output to the Editing surface for trimming, captions and audio sync before the client delivery.

How Kling 3.0 sits alongside Luma Ray 3.2 and Luma Ray 2 Flash

Model	Best for	When to use
Veo 3.1	Cinematic narrative, audio-driven content	Final renders where camera work and sound matter
Kling 3.0	Product, material and spatial accuracy	Final renders for hero shots, colourways, walkthroughs
Luma Ray 3.2	General-purpose photoreal video with start/end keyframes	Polished outputs across a broad range of subjects
Luma Ray 2 Flash	Fast, lower-cost video drafts	Concept previews, blocking, animatics before final render

Picking the Right Model for Your Project

The decision is not complicated once you know what each model prioritises. The question is: what does this shot need to do?

A practical decision framework

Veo 3.1: cinematic camera work, mood-driven sequences, narrative beats, social content with sound-on audio, character-driven scenes.
Kling 3.0: product hero shots, colourway reveals, surface material accuracy, spatial walkthroughs, anything where object coherence is non-negotiable.
Luma Ray 2 Flash: fast iteration drafts, blocking previews, animatics. Use this before committing credits to either of the two primary engines.
Luma Ray 3.2: general-purpose photoreal video across a broad range of subjects, with start and end keyframe support for 5- or 10-second clips.

Using Ray to choose and prompt

If you are working in a discipline that does not regularly engage with generative video, starting from a blank prompt is the steepest part of the learning curve. Stensyl's Ray assistant at /ray can take a brief described in plain terms, including discipline, duration, platform and intended use, and return a model recommendation alongside a structured prompt you can take directly into the Video surface or Film studio.

This mirrors the guided workflow that Google's Flow applies around Veo, but Ray works across all of Stensyl's models and surfaces, not just video. An exhibition designer describing a spatial walkthrough brief gets a different recommendation and a different prompt structure from a content creator planning a Reels series. The output is grounded in your actual brief, not a generic template.

Credit costs and budgeting across plans

No external source confirms Stensyl's per-second or per-clip credit costs for individual models. For precise figures on what Veo 3.1, Kling 3.0, Luma Ray 3.2 and Luma Ray 2 Flash cost in credits per generation, check Stensyl's current pricing page directly.

The practical principle holds regardless of the specific numbers. Both Veo 3.1 and Kling 3.0 are higher-fidelity engines suited to final or near-final renders. Luma Ray 2 Flash is the right tool for exploration and drafts. Use the faster, lower-cost engine to validate direction, then bring in Veo 3.1 or Kling 3.0 for the render that goes to a client or gets published.

Across the paid plans: Lite carries 1,000 credits per month at £10, Starter gives 2,500 at £22, Pro gives 6,000 at £42, and Studio gives 12,500 at £84. Concurrency scales from 1 simultaneous generation on Lite to 4 on Studio, which matters when you are running parallel Canvas comparisons.

Canvas: test both models in parallel

Before committing to a Film studio sequence or a long-form render, use Canvas to compare outputs side by side. Canvas Video nodes let you run the same prompt or the same Boards-sourced keyframe pair through both Veo 3.1 and Kling 3.0 in parallel. You can see how each model handles your subject matter before spending the credits on a full sequence. For a product designer unsure whether the shot calls for Kling's object coherence or Veo's cinematic framing, this is the fastest way to resolve the question.

Use Canvas to run both models on the same shot before committing to Film studio. The comparison costs far fewer credits than re-rendering a full sequence after choosing the wrong engine.

Fitting Both Models Into Your Existing Workflow

Veo 3.1 and Kling 3.0 are not standalone tools. They sit inside the same workflow surfaces you already use on Stensyl, and they integrate with Boards, Film studio, Editing and Avatar in ways that change what each surface is capable of.

Boards feeding into video generation

Boards supports collecting visual references and grouping frames into first/last-frame sequences for video generation. That output feeds directly into either Veo 3.1 or Kling 3.0, giving you keyframe-controlled video generation without writing a scene description from scratch.

For graphic designers, this is the fastest route to animated brand assets. A static logo lockup and a final-frame motion blur endpoint become a Veo 3.1-generated logo sting with a single handoff. For game developers, concept art frames for a cutscene can go from static Boards sequence to motion preview in Kling 3.0, testing whether an environment or character reads correctly in motion before the assets go into production.

Film studio: per-shot model selection

The Film studio's most useful feature for multi-model workflows is the ability to assign a different engine to each shot in a sequence. A marketing team building a campaign video can use Veo 3.1 for the atmospheric mood-setting opener and the emotional close, then switch to Kling 3.0 for the product close-up and the detail shot showing the packaging finish. All of these shots live in the same timeline, and the Editing surface handles the final cut.

This mirrors how motion teams in commercial production already work. Expressive narrative footage and precision product footage often come from different sources. Film studio lets you replicate that split inside a single platform session.

The Editing surface as a finishing step

After generation, the Editing surface handles timeline trimming, audio sync and captions (generated via Whisper speech-to-text, with a karaoke mode), with the option to bake captions directly into the exported MP4. For content creators publishing sound-on Veo 3.1 clips to social platforms, this step handles caption accuracy without leaving Stensyl. For product designers delivering Kling 3.0 renders to clients, trim and export happen in the same session as generation.

Avatar integration

Avatar-generated presenter clips are compatible with Film studio timelines. A product explainer video can combine an Avatar presenter clip introducing the product with Kling 3.0 shots of the product itself in motion, all within a single Film studio sequence. A social promo can open with an Avatar hook, transition to a Veo 3.1 cinematic montage and close with a call-to-action card. Both formats stay inside Stensyl from generation through to export.

Getting Started Without Burning Credits

The Free tier includes one free video render plus 150 credits. Those credits are a one-time allocation and do not reset. Every model on the platform, including Veo 3.1 and Kling 3.0, is accessible on Free. The free render is a direct, no-cost way to test whichever model is most relevant to your work before deciding whether to move to a paid plan.

How to use the free render strategically

Use it on the model that matches your discipline, not the one that sounds most impressive. A motion designer should test Veo 3.1 with a short camera-move prompt. A product designer should test Kling 3.0 with a clean hero shot. An exhibition designer or a web/UX designer exploring motion prototyping for the first time should describe the brief to Ray and use the free render on whichever model it recommends.

Keep the first test short. A 4- to 8-second clip from a straightforward prompt tells you how each model handles your subject matter, lighting assumptions and motion style. Veo 3.1 generates more slowly with audio enabled, so omitting audio from the first test prompt makes for a faster calibration step. Scale to longer durations and layered prompts once you have a clear read on the model's defaults.

Ray as prompt author for less familiar disciplines

Exhibition designers, web/UX designers and interior designers may be approaching generative video for the first time. The conventions of video prompting (shot language, duration framing, camera descriptors) are not obvious if your primary outputs are spatial plans, screen flows or brand systems. Ray at /ray translates a brief written in those disciplines' native language into a prompt built for video generation, and pairs it with a model recommendation. There is no need to learn video prompt conventions before running a first test.

The practical first action

Open the Video surface. Select either Veo 3.1 or Kling 3.0 based on the framework above. Run a single short test shot. Examine how the model handles your subject's motion, material and camera behaviour. Use that output to decide whether to build a full Film studio sequence, whether to run a Canvas comparison first, or whether a different engine is a better fit. That one test shot is the most efficient use of a free render and the clearest path to a confident model choice.

The free render is most valuable when it answers a specific question: does this model handle my subject matter the way I need it to? Keep the first shot short, focused and representative of your actual work.

```
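
For budgeting, the paid-plan figures quoted in the credit section (Lite 1,000 credits at £10, Starter 2,500 at £22, Pro 6,000 at £42, Studio 12,500 at £84) can be turned into a cost per 1,000 credits with a few lines of Python. This is a rough sketch, not a Stensyl tool. The `Plan` class and `clips_per_month` helper are illustrative names, and the per-clip credit cost is a made-up placeholder, because the article gives no confirmed per-clip figures. Replace it with the number from Stensyl's pricing page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Plan:
    """One paid plan, using the monthly figures quoted in the article."""
    name: str
    credits_per_month: int
    price_gbp: float

    def gbp_per_1000_credits(self) -> float:
        # Price of 1,000 credits on this plan, rounded to pence.
        return round(self.price_gbp * 1000 / self.credits_per_month, 2)


PLANS = [
    Plan("Lite", 1_000, 10),
    Plan("Starter", 2_500, 22),
    Plan("Pro", 6_000, 42),
    Plan("Studio", 12_500, 84),
]


def clips_per_month(plan: Plan, credits_per_clip: int) -> int:
    """Whole clips a monthly allocation covers at a given per-clip cost."""
    if credits_per_clip <= 0:
        raise ValueError("credits_per_clip must be positive")
    return plan.credits_per_month // credits_per_clip


# ASSUMPTION: placeholder only. The real per-clip cost varies by model
# and is not stated in the article; check the pricing page.
HYPOTHETICAL_CREDITS_PER_CLIP = 100

if __name__ == "__main__":
    for plan in PLANS:
        print(
            f"{plan.name}: £{plan.gbp_per_1000_credits():.2f} per 1,000 credits, "
            f"about {clips_per_month(plan, HYPOTHETICAL_CREDITS_PER_CLIP)} clips "
            f"at {HYPOTHETICAL_CREDITS_PER_CLIP} credits each"
        )
```

On the quoted prices, the cost per 1,000 credits falls from £10.00 on Lite to £8.80 on Starter, £7.00 on Pro and £6.72 on Studio, so the saving per credit between Pro and Studio is small compared with the step from Lite to Pro.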
