Aarjav Jain

AI × Bio. Graduate Researcher at Brown.

Providence, RI · contact

About

I'm Aarjav, a graduate student at Brown working where machine learning meets biology: spatial and single-cell genomics, representation learning, and graph-grounded discovery.

Now

Lab. the Singh Lab at Brown, building an agentic drug repurposing framework with Prof. Ritambhara Singh.
Researching. representation learning for spatial transcriptomics through self-supervised pretraining with Prof. Ying Ma.
Studying. Statistical and AI-Powered Methods for High-Dimensional Genomics Data Analysis, and Computational Linguistics, at Brown.
Excited by. large-scale Perturb-seq atlases and what they unlock for single-cell causal modeling (e.g. X-Atlas/Pisces [Wang et al.]).

Updated April 2026.

Education

Brown University, Providence, RI
Expected April 2027
MSc Computer Science [AI + Computational Biology track]
GTA: Data Structures, Algorithms, and Intractability.
Relevant Courses: Deep Learning in Genomics; Statistical and AI-Powered Methods for High-Dimensional Genomics Data Analysis.
King's College London, London, UK
July 2024
BSc Computer Science (Artificial Intelligence) with Management and a Year Abroad — First Class Honors
Year Abroad: AI + Biotechnology, University of Toronto.

Research

Graduate Researcher, Singh Lab
Sep 2025 — present
Brown University · advised by Prof. Ritambhara Singh
- Building an agentic drug repurposing framework over a biomedical knowledge graph integrating PharmaDB and Hetionet.
- The integrated graph spans roughly 67K nodes and 1.7M edges; drug-gene-disease hypotheses are prioritized across 48 cancer types.
- Used KPaths retrieval for relevant-path subgraphs, experimenting with inference strategies including LLM-as-judge, LLM-ensemble, and LLM-council.
- Constructed ontological mappings between clinical trial data and PharmaDB to validate against Phase 1/2 trial outcomes, in collaboration with Dr. Alejandro Schaffer (NIH/NCI), evaluating with precision@k and F0.5.
Self-Supervised Spatial Transcriptomics (JEPA), Brown University
Mar 2026 — present
advised by Prof. Ying Ma
- Designing a self-supervised spatial transcriptomics framework based on Joint-Embedding Predictive Architecture that learns tissue representations via spatial masking in latent space.
- Extended a Perceiver encoder with relational encoding to capture spatial neighborhoods on SToCorpus-88M, with coverage across high-resolution single-cell spatial transcriptomics technologies.
- Evaluating on cell-type annotation, spatial domain identification, and gene-expression imputation against scGPT-spatial, STAGATE, and GraphST baselines.
Neuropathology Stage Imputation from snRNA-seq, Brown University
October 2025 — December 2025
advised by Prof. Ritambhara Singh
- Trained a hierarchical transformer on the SEA-AD dataset (Allen Institute) to infer Alzheimer's neuropathology stages (Braak, Thal, CERAD, ADNC) from snRNA-seq profiles across MTG and A9 brain regions.
- Designed donor-level attention architecture to capture cross-cell-type and cross-region structure that flat per-cell models miss.
- Applied batch correction to remove donor bias, then fine-tuned scGPT brain-pretrained embeddings on the AD snRNA-seq dataset to capture disease-relevant signal.
Fascicle Length Segmentation, Vision in Human Robotics Lab @ KCL
Sep 2023 — May 2024
advised by Dr. Letizia Gionfrida
- Developed a zero-shot Noise2Noise CNN to automate B-mode ultrasound preprocessing, improving the Jaccard similarity coefficient by 11%.
- Tracked fascicle motion using affine optical flow and sparse representations, improving temporal consistency and reducing segmentation drift.

Experience

Graduate Software Engineer, Deutsche Bank
Jul 2024 — Aug 2025
London, UK
- Built a real-time risk monitoring system ingesting thousands of trades per second with nanosecond-level latency.
- Designed multi-partition data retrieval stacks in KDB+/Q for 100× faster high-volume queries; added a cross-stack de-duplication layer to eliminate redundant retrieval.
- Shipped a dashboard surfacing noisy usage of market data, yielding ~$400,000 annual cost savings.
Co-Founder, Upsizzle AI
Mar 2025 — Jun 2025
London / San Francisco
- Architected a multi-agent AI pipeline for Generative Engine Optimization, orchestrating 10+ LLM and non-LLM workers (NER, web crawlers, scoring models) across OpenAI, Gemini, Grok, and Qwen, assigning models by task, cost, and capability.
- Built an autonomous orchestrator with quality-check feedback loops and failure-tolerant redispatch, enabling agents to retry and refine outputs without manual intervention.
- Designed a per-client pipeline that generated simulated AI search responses across personas, crawled and summarized 200-600 cited web pages per run, and performed NER and sentiment extraction across all sources.
- Implemented weighted scoring systems across brand mentions, sentiment, and source authority to generate automated competitive strategy recommendations.

Selected Projects

Lucid Bio — Screening Layer for Biosynthesis
2025
- Built a protein screening pipeline that decomposes sequences into functional domains and runs parallel structure prediction (ESMFold) and similarity search (Foldseek, Diamond) to catch threats that evade standard BLAST screening.
- Designed an agentic LLM layer that reasons over per-domain signals to assess combinatorial risk from chimeric sequences.
ShardCompute — Distributed Inference Network
Jan 2026
- Distributed model-sharding inference that partitions large models across heterogeneous devices beyond single-machine memory limits.
- Coordinator-worker architecture for synchronized tensor computation across shards.
- Fault-tolerant state coordination across workers.
- Relay networking early on; migrated to P2P to reduce relay-hop latency.

Skills

Languages: Python · Java · C++ · SQL · KDB+/Q · Scala · TypeScript
ML / Scientific: PyTorch · TensorFlow · scikit-learn · NumPy · pandas · SciPy · vLLM · MLX · Scanpy · AnnData · OpenCV
Infrastructure: AWS (EC2, Lambda, Step Functions, S3, DynamoDB) · GCP · Docker · SLURM · Linux · Git · Neo4j · MLflow · Vertex AI

Contacts

Last revised April 2026.
