
Ivan Koh
Building intelligent systems to navigate financial markets.
Data Scientist passionate about applying quantitative methods to drive innovation in financial markets and investment strategies. My work blends statistical modeling with large-scale data systems to uncover actionable insights. I'm expanding into quantitative research and systematic trading, bringing the same rigor and builder's mindset to alpha-seeking problems.
Experience
A timeline of my key roles in data science, ML engineering, and quantitative research.
Openspace Ventures
Lead Developer and Data Scientist
Full-time●Jul 2023 - Present●Singapore · On-site
- Led data, ML, and infrastructure initiatives, translating investment theses into actionable, data-backed insights.
- Engineered production-grade services and end-to-end ML systems on GCP to identify promising investment opportunities from alternative data.
- Managed the firm's cloud infrastructure and DevOps practices, optimizing for performance, cost, and scalability.
- Drove measurable outcomes for portfolio companies by leading targeted ML projects in areas like demand forecasting and churn modeling.
Python
PyTorch
XGBoost
FastAPI
OpenAI
MLflow
Terraform
GCP
BigQuery
PostgreSQL
Docker
CI/CD
DevOps
National University of Singapore
Teaching Assistant (IS3107 Data Engineering)
Contract●Aug 2023 - Dec 2023●Singapore · On-site
- Conducted weekly lab sessions on data engineering and big data processing, with hands-on exercises and tutorials
- Guided students in implementing data pipelines using Apache Airflow
- Provided consultation sessions for course projects and assignments
Python
SQL
Apache Spark
Airflow
Datature
Software Engineering Intern
Internship●Jan 2023 - Jun 2023●Singapore · On-site
- Developed a full-stack application using Retool, JavaScript, Express.js, and Docker for real-time data analytics on customer engagement, centralizing sales reporting capabilities
- Implemented a proof-of-concept Cloud Run REST API using Flask and TensorFlow for training deep learning models, enabling customers to train computer vision models via HTTP requests; explored gRPC as an alternative communication protocol for training on edge devices with limited resources
TypeScript
TensorFlow
Python
Flask
Docker
Google Cloud Platform
GIC
Quantitative Research Intern
Internship●Aug 2022 - Nov 2022●Singapore · On-site
- Analysed financial time series through exploratory data analysis (EDA) and presented data-driven insights using visualisation in R
- Collaborated with quantitative strategists to generate and evaluate signals in R, based on ideas from academic research, for forecasting future price movements
R
SQL
Accenture
AI/Data Analytics Intern
Internship●May 2022 - Aug 2022●Singapore · On-site
- Implemented client requirements for a conversational AI project during UAT using Node.js and Dialogflow CX.
- Developed automated Python scripts to test the conversational AI and related APIs.
- Analyzed text data to create actionable reports for the AI team.
- Translated business requirements into API logic and automated testing.
Python
JavaScript
Dialogflow
Node.js
PwC
Risk Assurance (Data Trust Services) Intern
Internship●Dec 2021 - Jan 2022●Singapore · On-site
- Collaborated closely with a team of consultants using Agile methodology to gather business requirements in a supply chain digitalisation project
- Designed mock-ups using Figma to translate business requirements into actionable items for software developers
Figma
Projects
A selection of my work in machine learning, data science, and automation.
Developed statistical models and ML algorithms to predict startup funding success, integrating multiple data sources including web traffic and founder profiles.
Python
PyTorch
XGBoost
MLflow
Machine Learning
Built an end-to-end system combining fine-tuned YOLO models with LLMs for financial document analysis and investment due diligence automation.
PyTorch
HuggingFace
RAG
LLM
Deep Learning
Built encoder-decoder models with attention to generate descriptive image captions. Used PyTorch and Weights & Biases for experiment tracking.
PyTorch
HuggingFace
W&B
Deep Learning
Developed a RAG system for Singapore-specific data queries. Created a vector database with FAISS for efficient similarity search.
LangChain
FAISS
Streamlit
RAG
LLM
Built an end-to-end ML pipeline with Dockerized Airflow workflows for data processing and MLflow for model versioning and tracking.
Docker
Airflow
MLflow
MLOps