Machine Learning Engineer & Statistician

I build ML systems that turn noisy, high-dimensional, and time-dependent data into reliable products and decision workflows. My work spans production inference, sequence and time-series modeling, retrieval and reasoning systems, computer vision, neural and sensor data, and statistical validation.

Focus

Production ML systems. Training, inference, evaluation, monitoring, and deployment workflows for models that need to work outside notebooks.

Complex data modeling. Statistical and deep learning models for neural, neuromotor, sensor, image, text, and operational time-series data.

Retrieval and reasoning workflows. Systems that combine language models, retrieval, structured outputs, tools, and validation across heterogeneous data sources.

Applied research engineering. Translating ambiguous data problems into measurable model behavior, reproducible experiments, and maintainable software.

About

I am a machine learning engineer and Ph.D. statistician focused on applied AI systems that are accurate, measurable, and usable in real settings. I have built production ML workflows, retrieval and reasoning systems, forecasting models, computer-vision retrieval products, and deep-learning models for biological and sensor signals.

My background is deliberately broad: software engineering at Intel, applied science at AT&T Labs, production ML at Performance Photo and Raft, neuromotor modeling at Meta Reality Labs, and doctoral research in statistical machine learning at Carnegie Mellon. Across those settings, the common thread has been building models and systems for data that is noisy, high-volume, high-dimensional, or changing over time.

I am especially interested in roles that connect modeling depth with production engineering: ML engineering, applied research, research engineering, AI systems, and ML infrastructure.

Selected Work

Production AI Systems for Complex Workflows

Built ML and AI workflows that transform noisy, unstructured, multimodal, and time-dependent inputs into structured outputs for downstream products and decision workflows. The work includes data ingestion, model inference, validation, retrieval, tool execution, monitoring, and deployment across Kubernetes, on-premises, and constrained environments.

Focus areas: production inference, ML pipelines, model evaluation, structured outputs, workflow automation, observability
Technologies: Python, PyTorch, Docker, Kubernetes, MLflow, retrieval systems, sequence models, cloud/on-prem deployment patterns

Retrieval and Reasoning Systems

Developed workflows that combine language models, retrieval, validation, tool execution, and structured actions across heterogeneous data sources. These systems connect natural-language interfaces to downstream workflows by grounding model outputs in retrieved context, schemas, and executable tools.

Focus areas: retrieval-augmented generation, tool calling, structured outputs, orchestration, validation, heterogeneous data integration
Technologies: Python, LLMs, RAG, vector search, knowledge graphs, SPARQL, LangGraph-style orchestration, schema validation

Neural, Sensor, and Time-Series Modeling

Applied statistical modeling, deep learning, and signal-processing methods to noisy biological, neural, neuromotor, and sensor time-series data. Work includes neural population timing, cross-area coupling, denoising, trial-level variability, gesture and intent decoding, and robust feature extraction under subject, session, and task variability.

Focus areas: time-series ML, neural data analysis, sensor modeling, signal processing, uncertainty, statistical validation
Technologies: Python, PyTorch, SciPy, NumPy, Bayesian inference, statistical modeling, dimensionality reduction, signal processing

Computer Vision Retrieval for Person Re-Identification

Built and deployed a deep learning retrieval pipeline for person re-identification, achieving 96% rank-1 retrieval accuracy on production imagery. Connected model outputs to a user-facing search interface and cloud-backed inference services, contributing to a 33% increase in professional photo sales revenue.

Focus areas: computer vision, image retrieval, production inference, user-facing ML, measurable business impact
Technologies: Python, deep learning, computer vision, retrieval, AWS, Angular

Bayesian Forecasting for Large-Scale Time-Series Data

Developed and deployed a Bayesian time-series forecasting model for national cell-tower traffic, improving prediction accuracy by 36% over the production baseline. The work combined statistical modeling, scalable data processing, and model evaluation for high-volume operational time-series data.

Focus areas: time-series forecasting, Bayesian modeling, model evaluation, production analytics, large-scale data
Technologies: Python, Bayesian inference, time-series modeling, statistical validation, production data pipelines

Experience

Machine Learning Engineer, Raft (March 2025 – Present)
Building production AI/ML systems for complex data workflows, including retrieval and reasoning systems, sequence-model training and serving, report-processing pipelines, speech-to-text components, heterogeneous data integration, and deployment workflows emphasizing reliability, latency, observability, and maintainability.

Research Scientist Intern, Meta Reality Labs (May 2024 – August 2024)
Applied deep learning, statistical modeling, and signal-processing methods to streaming neuromotor sensor data. Built experimental ML workflows for noisy biological time-series signals with emphasis on reproducibility, model validation, and robust inference under subject and task variability.

Machine Learning Engineer, Performance Photo Co. (January 2023 – November 2023)
Built and deployed a computer vision retrieval system for person re-identification, reaching 96% rank-1 retrieval accuracy on production imagery and increasing professional photo sales revenue by 33%.

Applied Scientist Intern, AT&T Labs (June 2022 – August 2022)
Developed and deployed a Bayesian time-series forecasting model for national cell-tower traffic, improving prediction accuracy by 36% over the production baseline.

Software Engineer, Intel (July 2013 – July 2018)
Built production software for statistical analysis of semiconductor production data and applied survival analysis and reliability modeling to reduce product testing cost by 40%.

Research

My doctoral research focused on statistical and machine learning methods for understanding neural population dynamics from large-scale recordings. I worked on methods for estimating neural population burst timing, cross-area coupling, trial-level variability, denoising, and interpretable timing motifs across brain regions. This work reflects a broader interest in developing models that are statistically principled, computationally practical, and useful for interpreting complex biological and time-dependent systems.

See Publications for selected work.

Contact

I am interested in ML engineering, applied research, research engineering, AI systems, and ML infrastructure opportunities involving production AI, statistical machine learning, retrieval and reasoning workflows, sequence modeling, and complex data.