FlyMy.AI: Research Breakthroughs

We're building the platform that will power every real-time agent of the next decade, starting with high-throughput media generation.

AI compilers and inference

GenAI media and diffusion

Large-scale serving infra

Real-time agent reasoning

Today

Powering high-throughput media agents

FlyMy agents combine cutting-edge model research with battle-tested serving infrastructure, so you can ship production-grade experiences without rebuilding the stack from scratch.

One endpoint. Latency-optimized routing. Production-ready by design.

2018 AI Compiler

NVIDIA TensorRT compiler

Production-grade graph optimizations for deep learning, powering latency-sensitive inference at scale.

2019 AI Infra

NVIDIA Megatron-LM & Triton Inference Server

Large language model training and serving stacks that defined the playbook for multi-GPU systems.

2020-2024 AI Compiler

GPT-3 inference, FasterTransformer & custom Triton kernels

End-to-end compilers and kernels that outperform vanilla PyTorch on real-world LLM workloads.

2021-2024 GenAI Media

From StarGAN-v2 to Kandinsky-4 & VideoDALL-E

Fast-train architectures and diffusion models for image and video generation in production.

2022-2024 GenAI

Expressive speech & multimodal embeddings

PitchFlow, emotion-embedding TTS and RuCLIP-tiny enable richer, controllable media agents.

2018 GenAI Media

AlexNet-3D for early video understanding

One of the first 3D convolutional architectures deployed for real-world video and media understanding tasks.

A team shipping AI breakthroughs

Explore the research, compilers, models, and systems our team has shipped over the past decade.

AlexNet-3D · GenAI Media · 2018

ImproveYourVideos architecture · GenAI Media · 2024

RuCLIP-tiny · GenAI Media · 2022

VideoDALL-E · GenAI Media · 2022

Imagen-PyTorch · GenAI Media · 2022

Emotion-Embedding TTS · GenAI · 2023

PitchFlow expressive TTS · GenAI · 2024

Kandinsky-2.0 · GenAI Media · 2022

Kandinsky-2.1 · GenAI Media · 2023

Kandinsky-2.2 · GenAI Media · 2023

Kandinsky-4 diffusion · GenAI Media · 2024

Kandinsky Video · GenAI Media · 2024

StarGAN-v2 fast-train · GenAI Media · 2021

Outperforming PyTorch with custom Triton kernels · AI Compiler · 2025

NVIDIA FasterTransformer · AI Infra · 2020

NVIDIA Triton Inference Server · AI Infra · 2019

NVIDIA Megatron-LM · AI Infra · 2019

Pipeline-Parallel GPT-3 & GPT-J · AI Infra · 2021

GPT-3 inference · AI Compiler · 2024

NVIDIA TensorRT compiler · AI Compiler · 2018