Multi-agent debate system for pre-game NBA line prediction. Ingested live odds, statistical, and contextual data across six sources, implemented a ReAct-style agent loop with ChromaDB retrieval, and benchmarked multi-agent debate against single-agent chain-of-thought using Brier scores and calibration curves. Ran ablations isolating each data source and backtested against closing line value on a held-out season.
Open to full-time roles · Graduating December 2026
Aaditya Pai
ML Researcher, LLM Agent Security
Columbia MS Data Science · Purdue BS Computer Engineering · New York, NY
About
I'm an ML researcher focused on the security of LLM agent systems, where I build attacks, defenses, and evaluation frameworks for agents deployed in the real world. I'm currently a Graduate Researcher at Columbia's DAPLab (Data, Agents, and Processes Lab), where I work on prompt injection, unsafe tool use, and policy enforcement across multi-step workflows, and a Software Engineer Intern at CodeIntegrity. Alongside the security work, I'm drawn to quantitative finance and to problems where careful evaluation meets real deployment constraints.
Research & Publications
Selected work on the security and adversarial robustness of LLM agents: attacks, defenses, and evaluation.
- arXiv:2606.18530 · 2026
Evaluating Prompting-Based Defenses Against Domain-Camouflaged Injection Attacks
Aaditya Pai
- arXiv:2605.22001 · 2026
Blind Spots in the Guard: How Domain-Camouflaged Injection Attacks Evade Detection in Multi-Agent LLM Systems
Aaditya Pai
- arXiv:2606.17467 · 2026
PARSE: Provenance-Aware Retrieval Sanitization for Professional Domain LLM Agents
Aaditya Pai
- OpenReview · 2026
Deterministic Data Flow Control Improves Agent Utility and Reduces Safety Violations
Charlie Summers, Prajwal Raghunath, Aaditya Pai, Mayur Kulkarni, Zhuo Zhang, Oliver Kennedy, Eugene Wu
Experience
Software Engineer Intern · CodeIntegrity
Jul 2026 – Present · New York, NY
- Building benchmarking and evaluation infrastructure for agent security at a funded AI agent security startup.
Graduate Researcher · Columbia DAPLab (Data, Agents, and Processes Lab)
Feb 2026 – Present · New York, NY
- Researching adversarial robustness and security of LLM agent systems: prompt injection, unsafe tool use, and policy enforcement across multi-step workflows.
- Built evaluation infrastructure and attack pipelines across AgentDojo, WebArena, InjecAgent, and ToolBench to measure detector failure modes, attack transferability, and agent vulnerability under adversarial conditions.
- Designed domain-camouflaged injection attacks and provenance-aware defenses for multi-agent systems.
Data Engineer · Cummins Inc.
Jul 2023 – Jul 2025 · Chicago, IL
- Engineered anomaly detection and validation pipelines over millions of enterprise telemetry records using Python and SQL; generated automated action lists that reduced annual spend by $550K through systematic identification of inactive licenses and unused resources.
- Built scalable ETL automation frameworks with Python, SQL, PowerShell, and REST APIs; defined data quality checks for missing values, anomalous distributions, and schema drift across staging and production workflows.
Software Engineer Intern · ADVANCE.AI
May 2022 – Aug 2022 · Singapore
- Deployed CI/CD workflows and service routing infrastructure for distributed ML services; integrated Istio service mesh within Kubeflow across 3+ microservices to improve observability and deployment reliability.
- Engineered real-time monitoring and anomaly detection for a 10+ node Kubernetes cluster on Linux; defined threshold-based alerting rules for latency, request size, and data transfer.
Undergraduate Researcher · Autonomous Motorsports Purdue (AMP)
Aug 2022 – May 2023 · West Lafayette, IN
- Built a CNN with ELU activations for real-time steering prediction in a simulated autonomous driving environment, combining a hybrid FAST-ORB feature pipeline with a regression steering model trained on 45K+ frames of sensor data. Reached under 5 degrees mean absolute error at 30 FPS and improved obstacle avoidance accuracy by 30% under real-time latency constraints.
- Won the "Share with the World" VIP Award at the Purdue Undergraduate Research Conference; benchmarked feature extraction methods, evaluated model stability across simulation conditions, and optimized the inference pipeline for real-time deployment.
Undergraduate Researcher · Lunabotics (NASA Robotic Mining Competition)
Aug 2021 – Dec 2021 · West Lafayette, IN
- Benchmarked object detection and tracking algorithms under noisy visual conditions across varied lighting and terrain, analyzing accuracy, stability, and failure modes; selected the MOSSE filter through systematic evaluation, reaching 95% tracking accuracy under real-time constraints.
- Built an end-to-end OpenCV perception pipeline integrating detection and tracking for autonomous navigation in the NASA Lunabotics Mining Competition environment, validated across edge cases with documented, reproducible evaluation methodology.
Projects
Reimplemented the Avellaneda and Lee (2010) statistical arbitrage framework from scratch: PCA factor decomposition, OU residual modeling, and s-score signal generation. Extended it with HMM regime filtering to suppress signals in trending markets, volatility-targeted position sizing, and Almgren-Chriss optimal execution modeling. Built an interactive React frontend for backtesting with automated Sharpe ratio, drawdown, and turnover reporting.
Privacy-preserving authorization protocol for autonomous AI agents using zk-SNARKs. Agents prove compliance with identity and scope constraints without revealing credentials. Implemented Groth16 circuits in circom, deployed Solidity verifier contracts on an Ethereum testnet, and benchmarked proof generation time, R1CS constraints, and on-chain gas costs.
High-performance limit order book in C++ with price-time priority matching, a custom object-pool allocator, and an Avellaneda-Stoikov market maker. Achieved 166ns median and 750ns p99 matching latency across 10M+ order events. Validated reconstructed book state against LOBSTER NASDAQ tick data over 400K real market events, and implemented a FIX 4.2 protocol parser with an end-to-end order round-trip demo.
Skills
Languages
- Python
- C++
- C
- SQL
- R
- SAS
- MATLAB
- Solidity
- PowerShell
- Embedded C
- SystemVerilog
- VHDL
- Assembly
ML & AI
- PyTorch
- TensorFlow
- scikit-learn
- pandas
- NumPy
- statsmodels
- LangChain
- ChromaDB
- OpenCV
Cloud & Infrastructure
- AWS
- GCP
- Azure
- Vertex AI
- Docker
- Kubernetes
- Kubeflow
- Helm
- Amazon EKS
- CI/CD
- Git
- Linux
- Grafana
- Prometheus
- BigQuery
- Firebase
Frontend & Visualization
- React
- Power BI
- Tableau
- Matplotlib
- Jupyter
Focus Areas
- LLM agent security
- prompt injection
- adversarial robustness
- RAG
- multi-agent systems
- quantitative modeling
- backtesting
Education
Columbia University
M.S. in Data Science
Purdue University
B.S. in Computer Engineering
Awards & Certifications
Awards
Share with the World ML Research Award
Purdue Undergraduate Research Exposition
Eta Kappa Nu (IEEE-HKN), Beta Chapter
ECE Honor Society
Dean's List & Semester Honors
7 of 8 semesters
Charles W. Brown ECE Scholarship
Purdue ECE
Eli Shay Electrical Engineering Scholarship
Purdue ECE
Certifications
Advanced Calculus for Financial Engineering
Baruch College, CUNY
Financial Markets
Yale University
Managing ML Projects with Google Cloud
Google
Software Engineering Virtual Experience
JPMorgan Chase & Co.
Intermediate C++
Microsoft