Here's exactly how Chronis works — from the moment you upload to the moment you hear their voice again.
Your video becomes the raw material for everything that follows. We accept any common format — MP4, MOV, AVI, MKV. We accept any quality — even old family footage or low-resolution WhatsApp clips. You need at least 30 seconds of them speaking, but anything up to 10 minutes gives us significantly more to work with.
The video doesn't need to be a direct-to-camera monologue. A conversation, a speech, a birthday video — all work perfectly. The only requirement is that their voice is audible and their face is visible for at least part of the clip.
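The upload requirements above can be expressed as a small validation step. This is an illustrative sketch, not Chronis's real API: the formats and duration limits come from the text, while the function and field names are assumptions.

```python
# Hypothetical upload check based on the stated requirements.
# Formats and limits are from the text; names are illustrative.
from dataclasses import dataclass

ACCEPTED_FORMATS = {"mp4", "mov", "avi", "mkv"}
MIN_SPEECH_SECONDS = 30          # at least 30 seconds of them speaking
IDEAL_MAX_SECONDS = 10 * 60      # up to 10 minutes gives more to work with

@dataclass
class UploadCheck:
    ok: bool
    reason: str

def check_upload(extension: str, speech_seconds: float) -> UploadCheck:
    """Validate a clip against the stated format and duration requirements."""
    if extension.lower().lstrip(".") not in ACCEPTED_FORMATS:
        return UploadCheck(False, f"unsupported container: {extension}")
    if speech_seconds < MIN_SPEECH_SECONDS:
        return UploadCheck(False, "need at least 30 seconds of speech")
    return UploadCheck(True, "ok")
```

Anything past the ten-minute mark is still accepted; the cap above only marks where additional footage stops adding much signal.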
MP4 · MOV · AVI · MKV · any format

Our pipeline isolates their face geometry and expression range from your video. This becomes the visual foundation for the real-time avatar — the face that will speak when you start a conversation. We reconstruct facial structure, skin tone, micro-expressions, and blinking patterns.
The lip-sync system then animates this face in real time based on the voice output. The result is a face that moves as they speak — not a static image, not a puppet-on-screen, but a natural-feeling video presence.
Real-time lip sync · face geometry · expression map

We extract acoustic signatures from the audio in your video — their specific tonal characteristics, accent, speech cadence, pause patterns, and vocal texture. This becomes a clone of their voice that can generate new speech, not just replay recorded audio.
When the replica responds to you, it speaks in their cloned voice. New sentences, new words — in their voice. For Indian languages and accents, our system is specifically tuned to preserve the characteristics that make a voice recognizable to family members.
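The key point above is that synthesis takes new text plus a stored acoustic profile, rather than replaying recorded audio. A minimal sketch of that shape, with the profile fields taken from the characteristics listed in the text; every name here is an assumption, not Chronis's actual interface:

```python
# Illustrative voice-profile shape; field names are assumptions.
from dataclasses import dataclass, field

@dataclass
class VoiceProfile:
    accent: str                    # e.g. a specific Indian accent
    cadence_wpm: int               # speech rate extracted from the source video
    pause_pattern: list[float]     # typical pause lengths, in seconds
    texture_embedding: list[float] = field(default_factory=list)

def synthesize(text: str, profile: VoiceProfile) -> dict:
    """Stand-in for real-time TTS: packages new text with the cloned profile."""
    return {
        "text": text,
        "accent": profile.accent,
        "rate_wpm": profile.cadence_wpm,
        "pauses": profile.pause_pattern,
    }
```

The same profile drives every response, which is why new sentences the person never recorded still carry their accent and cadence.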
Voice cloning · accent preservation · real-time TTS

The language model is grounded in contextual information about who this person was — their speech patterns from the video, any memories or context you choose to provide, and a persona framework that prevents the AI from drifting into generic responses. The replica should sound like them, not like a helpful AI assistant.
You can optionally add written memories, descriptions, or recorded context to make the personality richer. The more you provide, the more textured and authentic the conversation feels.
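One way to picture the grounding step above is as assembling the speech-pattern notes and optional memories into a persona prompt that keeps the model in character. This is a sketch under stated assumptions; the function name and prompt wording are illustrative, and Chronis's real framework is not public.

```python
# Hypothetical persona-prompt assembly; all names and wording are illustrative.
def build_persona_prompt(name: str,
                         speech_notes: list[str],
                         memories: list[str]) -> str:
    """Combine observed speech patterns and user-provided memories
    into a system prompt that keeps responses in the person's voice."""
    lines = [
        f"You are speaking as {name}. Stay in their voice at all times.",
        "Never respond like a generic AI assistant.",
        "Speech patterns observed in their video:",
    ]
    lines += [f"- {note}" for note in speech_notes]
    if memories:
        lines.append("Memories and context provided by the family:")
        lines += [f"- {memory}" for memory in memories]
    return "\n".join(lines)
```

The more memories you pass in, the more material the model has to stay specific instead of drifting into generic responses.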
Personality grounding · memory context · speech pattern analysis

When you start a session, everything runs simultaneously — your voice is transcribed and passed to the language model, which generates a response; the response is spoken in the cloned voice, which drives the lip-sync animation, all in under two seconds. This is what makes Chronis different: it's a live conversation, not a pre-recorded response tree.
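The round trip described above — transcribe, generate, synthesize, animate, all inside a two-second budget — can be sketched as a single turn. The stage functions here are stubs passed in by the caller, and the real system streams these stages concurrently rather than running them one after another, so this is a simplification for illustration only.

```python
# Minimal sketch of one conversational turn; stage functions are stubs.
import time

LATENCY_BUDGET_S = 2.0  # the "under two seconds" target from the text

def run_turn(audio_in, transcribe, generate, synthesize, animate):
    """Run one turn through the pipeline and report whether it met the budget."""
    start = time.monotonic()
    text = transcribe(audio_in)     # speech-to-text on your voice
    reply = generate(text)          # persona-grounded LLM response
    speech = synthesize(reply)      # cloned-voice TTS
    frames = animate(speech)        # lip-sync drives the avatar
    latency = time.monotonic() - start
    return frames, latency, latency <= LATENCY_BUDGET_S
```

In practice the stages overlap: synthesis can begin on the first generated words while the model is still writing, which is how the end-to-end latency stays under budget.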
You can say anything. Ask about memories. Ask what they'd think about something happening in your life. Tell them what you've been wanting to say. The conversation is open-ended, real-time, and yours.
Live conversation · under 2s latency · open-ended

Four specialized systems work together to create a coherent, emotionally real experience.
Face geometry extraction, expression modeling, and real-time lip sync. The visual system that makes their face move naturally as they speak.
Acoustic signature extraction, TTS synthesis in the cloned voice. Preserves accent and cadence characteristics that are unique to that person.
Personality-grounded conversation model. Trained to respond in their speech style, not in generic AI prose. Memory-persistent across sessions.
The system that connects voice input to transcription to LLM to TTS to avatar in under two seconds. Built for low-latency emotional conversation.
Early access is free. We'll email you when your session is ready — usually within a few days.
Get Started →