Immunosequencing Algorithms Laboratory

We decode the language of immunity.

We read the adaptive immune receptor repertoire — millions of sequences that record every infection, vaccine and tumour the immune system has met — and build the algorithms, databases and structural models that turn that signal into understanding.

Every immune repertoire is a living archive. We learn to read it — linking sequence, structure and specificity to tell self from non-self.

Research

Five questions we are chasing.

  1. T-cell repertoire annotation

    Mapping the antigen specificities hidden in high-throughput TCR sequencing data, and curating a database of receptors with known targets. We infer sequence-motif biomarkers statistically and connect repertoire structure to the immunogenicity of its cognate antigens.

  2. Modelling the TCR:pMHC complex

    In-silico modelling of TCR:peptide:MHC structures, tying the biophysics of interactions to the systems biology of repertoires. We study immune evasion and detection, thymic selection and T-cell differentiation. We build statistical models of TCR:pMHC contacts to predict binding for unseen epitopes.

  3. Epitope immunogenicity

    Ranking foreign and self peptides by their predicted ability to provoke a response. We search for the physicochemical signatures the adaptive immune system uses as a self-versus-non-self classifier, and turn them into tools for selecting optimal neoantigen targets.

  4. Viral infections and vaccination

    Screening thousands of repertoire datasets from vaccinated, convalescent, previously infected and healthy donors to find correlates of immunity in the COVID-19 "involuntary experiment": immunogenic SARS-CoV-2 epitopes, the TCRs that recognise them, and the imprint that past and ongoing infection and vaccination leave behind.

  5. Cancer immunotherapy

    Using TCR and BCR profiling of tumour and blood RNA-seq and single-cell data to stratify patients, predict therapy outcomes and inform prognosis, and combining immunogenicity prediction, repertoire profiling and TCR:pMHC complex modelling to design neoantigen vaccines.

Software

Open tools, built in the open.

Everything we make ships free and open-source, from curated data to fast, accurate algorithms.

A curated database of T-cell receptor sequences with known antigen specificities — the starting point for training and benchmarking machine-learning models of antigen recognition.

vdjdb.com

Matches T-cell receptor repertoires to their antigen-specificity profiles, annotating raw sequencing output with likely targets.

github:vdjmatch
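
As a rough illustration of what specificity matching involves, the sketch below annotates query CDR3 sequences against a small reference table, allowing a limited number of amino-acid substitutions. It is a minimal sketch under stated assumptions, not vdjmatch's actual algorithm or interface: the function names, the `max_mismatches` parameter and the use of plain Hamming distance on equal-length CDR3s are all hypothetical, and the reference records are toy placeholders rather than real database entries.

```python
from typing import Dict, List, Tuple


def hamming(a: str, b: str) -> int:
    """Count mismatched positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def annotate(
    queries: List[str],
    reference: Dict[str, str],
    max_mismatches: int = 1,
) -> Dict[str, List[Tuple[str, str, int]]]:
    """Map each query CDR3 to (reference CDR3, epitope label, distance) hits.

    Assumption: only substitutions are considered, so reference entries of a
    different length are skipped rather than aligned.
    """
    hits: Dict[str, List[Tuple[str, str, int]]] = {}
    for cdr3 in queries:
        matches = []
        for ref_cdr3, epitope in reference.items():
            if len(ref_cdr3) != len(cdr3):
                continue
            distance = hamming(cdr3, ref_cdr3)
            if distance <= max_mismatches:
                matches.append((ref_cdr3, epitope, distance))
        # Closest reference receptors first.
        hits[cdr3] = sorted(matches, key=lambda m: m[2])
    return hits


# Toy reference table (hypothetical records, for illustration only).
TOY_REFERENCE = {
    "CASSIRSSYEQYF": "epitope_A",
    "CASSLAPGATNEKLFF": "epitope_B",
}

toy_hits = annotate(["CASSIRSAYEQYF", "CASSXXXXXXXXF"], TOY_REFERENCE)
```

In practice the reference table would be loaded from a curated specificity database, and a scoring scheme that also handles insertions and deletions would likely be needed.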

Exploratory analysis of large volumes of immune-repertoire data: turning thousands of AIRR-formatted files into comprehensive statistical summaries via Parquet and Polars.

github:vdjtools

A universal immune-repertoire analysis library implementing a comprehensive set of state-of-the-art algorithms. Drop it into pipelines and notebooks with ease — Claude Code and GitHub Copilot ready, via skills.

github:mirpy

Annotate, score and rank TCR:pMHC structures — modelled or native — and predict binding and affinity.

github:tcren
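
To give a feel for contact-based ranking, the sketch below scores candidate peptides by summing pairwise residue energies over a fixed list of TCR-peptide contacts, then sorts the candidates. It is a hedged illustration only: the `Contact` type, both function names, the "lower score is better" convention and every numeric value in the toy potential are assumptions made for this example, not the tool's real interface or parameters.

```python
from typing import Dict, List, Tuple

# (amino acid of the TCR residue, index of the contacted peptide position)
Contact = Tuple[str, int]
# Pairwise energies keyed by (TCR amino acid, peptide amino acid).
Potential = Dict[Tuple[str, str], float]


def score_peptide(
    peptide: str,
    contacts: List[Contact],
    potential: Potential,
    default: float = 0.0,
) -> float:
    """Sum pairwise energies over all contacts; unknown pairs get `default`."""
    return sum(
        potential.get((tcr_aa, peptide[pos]), default)
        for tcr_aa, pos in contacts
    )


def rank_peptides(
    peptides: List[str],
    contacts: List[Contact],
    potential: Potential,
) -> List[str]:
    """Order candidates from lowest (assumed most favourable) score upwards."""
    return sorted(peptides, key=lambda p: score_peptide(p, contacts, potential))


# Toy inputs with made-up values, for illustration only.
toy_contacts = [("Y", 0), ("D", 2)]
toy_potential = {("Y", "G"): -1.0, ("D", "K"): -2.0, ("D", "E"): 1.5}
toy_ranking = rank_peptides(["GAE", "AAA", "GAK"], toy_contacts, toy_potential)
```

The contact list would normally come from a modelled or experimentally solved structure, and the potential from statistics over known complexes.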

Ultrafast V-D-J allele and CDR/FR region mapper for nucleotide and protein immune-receptor sequences, from small FASTA files to large FASTQ volumes across amplicon, RNA- and single-cell sequencing. Built on MMseqs2.

github:arda

Publications

Selected work.

A line through a decade of immune-repertoire research, from error correction to structure-based prediction.

In press · 2026

Towards high-quality, large-scale T-cell receptor antigen-specificity data: challenges and promises

Provisionally accepted — full text to be announced.

Nature Computational Science · 2024

Structure-based prediction of T-cell receptor recognition of unseen epitopes using TCRen

Leveraging protein structural data to predict TCR–peptide interactions for unseen epitopes — useful for cancer immunotherapy, autoimmunity and vaccine design. Featured in News & Views.

Nature Methods · 2022

VDJdb in the pandemic era: a compendium of T-cell receptors specific for SARS-CoV-2

Millions of TCR sequences have now been isolated from donors with COVID-19. This VDJdb release incorporates TCR-specificity data from across the SARS-CoV-2 literature.

Immunity · 2020

SARS-CoV-2 epitopes are recognised by a public and diverse repertoire of human T-cell receptors

T-cell responses were stronger in donors sampled during the pandemic than before it, hinting at cross-reactive protection, with public, germline-featured TCR motifs against two HLA-A*02-restricted epitopes.

Nucleic Acids Research, Database issue · 2018

VDJdb: a curated database of T-cell receptor sequences with known antigen specificity

Continuously aggregating TCR-specificity information, meaning which receptors recognise known epitopes presented by known MHC class I and II molecules, into a curated public repository.

Nature Methods · 2014

Towards error-free profiling of immune repertoires

MIGEC — molecular-identifier-group-based error correction — enables near-absolute error correction of high-throughput sequencing data while preserving the natural diversity of complex repertoires.
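
The general idea behind molecular-identifier-based error correction can be sketched as follows: reads carrying the same molecular identifier (UMI) are grouped, and a per-position majority vote yields one consensus sequence per group. This is a simplified illustration, not the published implementation. The function name, the `min_group_size` threshold and the shortcut of keeping only reads of the most common length (instead of aligning them) are assumptions made for brevity.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, List, Tuple


def umi_consensus(
    reads: Iterable[Tuple[str, str]],
    min_group_size: int = 2,
) -> Dict[str, str]:
    """Collapse (umi, sequence) reads into one consensus sequence per UMI.

    Groups smaller than `min_group_size` are dropped, because a single read
    gives no evidence for telling real variants from sequencing errors.
    """
    groups: Dict[str, List[str]] = defaultdict(list)
    for umi, seq in reads:
        groups[umi].append(seq)

    consensus: Dict[str, str] = {}
    for umi, seqs in groups.items():
        if len(seqs) < min_group_size:
            continue
        # Simplification: keep reads of the most common length, no alignment.
        length = Counter(len(s) for s in seqs).most_common(1)[0][0]
        same_length = [s for s in seqs if len(s) == length]
        # Majority vote at each position across the group.
        consensus[umi] = "".join(
            Counter(column).most_common(1)[0][0]
            for column in zip(*same_length)
        )
    return consensus


toy_reads = [
    ("AAA", "ACGT"),
    ("AAA", "ACGT"),
    ("AAA", "ACCT"),  # one read with a substitution error
    ("CCC", "TTTT"),  # singleton group, dropped
]
toy_consensus = umi_consensus(toy_reads)
```

Because each consensus represents one starting molecule, rare but genuine variants survive as long as they carry their own identifier, which is how this kind of approach preserves repertoire diversity.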

People

The lab.

Members

Alumni

Students

Affiliations

Where we work.

Moscow, Russia

Collaborations

Partners across the world.

Our work is built with an international network of immunologists, structural biologists and computational scientists.