Senior Research Scientist, Infrastructure System Lab

Seattle · Infrastructure · Engineering

About the Team

We are the Infrastructure System Lab, a hybrid research and engineering group building next-generation AI-native data infrastructure. Our work sits at the intersection of databases, large-scale systems, and AI. We drive innovation across:

- Next-generation databases: We build VectorDBs and multi-modal AI-native databases designed to support large-scale retrieval and reasoning workloads.
- AI for Infra: We use machine learning to build intelligent algorithms for infrastructure optimization, tuning, and observability.
- LLM Copilot: We develop LLM-based tooling such as NL2SQL and NL2Chart.
- High-performance cache systems: We develop a multi-engine key-value store optimized for distributed storage workloads. We are also building KV caches for LLM inference at scale.

This is a highly collaborative team where researchers and engineers work side by side to bring innovations from paper to production. We publish, prototype, and build robust systems deployed across key products used by millions.

About the Role

We are seeking a highly motivated and technically strong Research Scientist with a PhD in Computer Science, Databases, Information Retrieval, or a related field to join our team. You will design and optimize state-of-the-art vector indexing algorithms to power large-scale similarity search, filtered search, and hybrid retrieval use cases. Your work will contribute directly to the next-generation vector database infrastructure that supports real-time and offline retrieval across billions or even trillions of high-dimensional vectors.

Why Join Us

- Work on problems at the frontier of AI x systems with huge practical impact.
- Collaborate with a world-class team of researchers and engineers.
- Publish, attend conferences, and contribute to open source.
- Competitive compensation, generous research support, and a culture of innovation.

Responsibilities

- Research and develop new algorithms for approximate nearest neighbor (ANN) search, especially for filtered, hybrid, or disk-based scenarios.
- Optimize existing algorithms for scalability, low latency, memory footprint, and hybrid search support.
- Collaborate with engineering teams to prototype, benchmark, and productionize indexing solutions.
- Contribute to academic publications, open-source libraries, or internal technical documentation.
- Stay current with research trends in vector search, retrieval systems, retrieval-augmented generation (RAG), large language models (LLMs), and related areas.
