About
I'm Aarjav, a graduate student at Brown working where machine learning meets biology: spatial and single-cell genomics, representation learning, and graph-grounded discovery.
Now
- Lab: the Singh Lab at Brown, building an agentic drug repurposing framework with Prof. Ritambhara Singh.
- Researching: representation learning for spatial transcriptomics through self-supervised pretraining with Prof. Ying Ma.
- Studying: Statistical and AI-Powered Methods for High-Dimensional Genomics Data Analysis, and Computational Linguistics, at Brown.
- Excited by: large-scale Perturb-seq atlases and what they unlock for causal modeling in single-cell data (e.g., X-Atlas/Pisces [Wang et al.]).
Updated April 2026.
Education
Brown University, Providence, RI
MSc Computer Science [AI + Computational Biology track]
GTA: Data Structures, Algorithms, and Intractability.
Relevant Courses: Deep Learning in Genomics; Statistical and AI-Powered Methods for High-Dimensional Genomics Data Analysis.
King's College London, London, UK
BSc Computer Science (Artificial Intelligence) with Management and a Year Abroad, First Class Honors
Year Abroad: AI + Biotechnology, University of Toronto.
Research
Graduate Researcher, Singh Lab
Brown University · advised by Prof. Ritambhara Singh
- Building an agentic drug repurposing framework over a biomedical knowledge graph integrating PharmaDB and Hetionet.
- The integrated graph spans roughly 67K nodes and 1.7M edges; drug-gene-disease hypotheses are prioritized across 48 cancer types.
- Used KPaths retrieval for relevant-path subgraphs, experimenting with inference strategies including LLM-as-judge, LLM-ensemble, and LLM-council.
- Constructed ontological mappings between clinical trial data and PharmaDB to validate against Phase 1/2 trial outcomes, in collaboration with Dr. Alejandro Schaffer (NIH/NCI), evaluating with precision@k and F0.5.
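For readers unfamiliar with the two metrics named above, here is a minimal sketch of precision@k and F0.5. It is an illustration only, not the project's evaluation code: the function names, the drug identifiers, and the toy ranking are all hypothetical. F0.5 uses the standard F-beta formula with beta = 0.5, which weights precision more heavily than recall.

```python
from typing import Sequence, Set


def precision_at_k(ranked: Sequence[str], relevant: Set[str], k: int) -> float:
    """Fraction of the top-k ranked items that are in the relevant set."""
    if k <= 0:
        raise ValueError("k must be a positive integer")
    hits = sum(1 for item in ranked[:k] if item in relevant)
    return hits / k


def f_beta(precision: float, recall: float, beta: float = 0.5) -> float:
    """Standard F-beta score; beta < 1 favors precision over recall."""
    b2 = beta * beta
    denom = b2 * precision + recall
    if denom == 0:
        return 0.0
    return (1 + b2) * precision * recall / denom


# Hypothetical toy data: a ranked list of candidates and a validated set.
ranked = ["drug_a", "drug_b", "drug_c", "drug_d"]
relevant = {"drug_a", "drug_c", "drug_e"}

p_at_4 = precision_at_k(ranked, relevant, k=4)                 # 2 hits out of 4
recall_at_4 = len(set(ranked[:4]) & relevant) / len(relevant)  # 2 out of 3
f_half = f_beta(p_at_4, recall_at_4, beta=0.5)
```

On this toy input, precision@4 is 0.5, recall is 2/3, and F0.5 works out to 10/19 (about 0.526), which sits closer to the precision than to the recall.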
Self-Supervised Spatial Transcriptomics (JEPA), Brown University
advised by Prof. Ying Ma
- Designing a self-supervised spatial transcriptomics framework based on the Joint-Embedding Predictive Architecture (JEPA) that learns tissue representations via spatial masking in latent space.
- Extended a Perceiver encoder with relational encoding to capture spatial neighborhoods on SToCorpus-88M, with coverage across high-resolution single-cell spatial transcriptomics technologies.
- Evaluating on cell-type annotation, spatial domain identification, and gene-expression imputation against scGPT-spatial, STAGATE, and GraphST baselines.
Neuropathology Stage Imputation from snRNA-seq, Brown University
advised by Prof. Ritambhara Singh
- Trained a hierarchical transformer on the SEA-AD dataset (Allen Institute) to infer Alzheimer's neuropathology stages (Braak, Thal, CERAD, ADNC) from snRNA-seq profiles across MTG and A9 brain regions.
- Designed a donor-level attention architecture to capture cross-cell-type and cross-region structure that flat per-cell models miss.
- Applied batch correction to remove donor bias, then fine-tuned scGPT brain-pretrained embeddings on the AD snRNA-seq dataset to capture disease-relevant signal.
Fascicle Length Segmentation, Vision in Human Robotics Lab @ KCL
advised by Dr. Letizia Gionfrida
- Developed a zero-shot Noise2Noise CNN to automate B-mode ultrasound preprocessing, improving the Jaccard similarity coefficient by 11%.
- Tracked fascicle motion using affine optical flow and sparse representations, improving temporal consistency and reducing segmentation drift.
Experience
Graduate Software Engineer, Deutsche Bank
London, UK
- Built a real-time risk monitoring system ingesting thousands of trades per second with nanosecond-level latency.
- Designed multi-partition data retrieval stacks in KDB+/Q for 100× faster high-volume queries; added a cross-stack de-duplication layer to eliminate redundant retrieval.
- Shipped a dashboard surfacing noisy usage of market data, yielding ~$400,000 annual cost savings.
Co-Founder, Upsizzle AI
London / San Francisco
- Architected a multi-agent AI pipeline for Generative Engine Optimization, orchestrating 10+ LLM and non-LLM workers (NER, web crawlers, scoring models) across OpenAI, Gemini, Grok, and Qwen, assigning models by task, cost, and capability.
- Built an autonomous orchestrator with quality-check feedback loops and failure-tolerant redispatch, enabling agents to retry and refine outputs without manual intervention.
- Designed a per-client pipeline that generated simulated AI search responses across personas, crawled and summarized 200-600 cited web pages per run, and performed NER and sentiment extraction across all sources.
- Implemented weighted scoring systems across brand mentions, sentiment, and source authority to generate automated competitive strategy recommendations.
Selected Projects
- Built a protein screening pipeline that decomposes sequences into functional domains and runs parallel structure prediction (ESMFold) and similarity search (Foldseek, Diamond) to catch threats that evade standard BLAST screening.
- Designed an agentic LLM layer that reasons over per-domain signals to assess combinatorial risk from chimeric sequences.
ShardCompute — Distributed Inference Network
- Distributed model-sharding inference that partitions large models across heterogeneous devices beyond single-machine memory limits.
- Coordinator-worker architecture for synchronized tensor computation across shards.
- Fault-tolerant state coordination across workers.
- Relay networking early on; migrated to P2P to reduce relay-hop latency.
Skills
- Languages: Python · Java · C++ · SQL · KDB+/Q · Scala · TypeScript
- ML / Scientific: PyTorch · TensorFlow · scikit-learn · NumPy · pandas · SciPy · vLLM · MLX · Scanpy · AnnData · OpenCV
- Infrastructure: AWS (EC2, Lambda, Step Functions, S3, DynamoDB) · GCP · Docker · SLURM · Linux · Git · Neo4j · MLflow · Vertex AI
Contacts
- Personal: aarjav02@gmail.com
- Work: aarjav_jain@brown.edu
- Phone: +1 (401) 346-2910
- LinkedIn: in/aarjav-jain
Last revised April 2026.
