Ivan Koh

Building intelligent systems to navigate financial markets.

Data Scientist passionate about applying quantitative methods to drive innovation in financial markets and investment strategies. My work blends statistical modeling with large-scale data systems to uncover actionable insights. I'm expanding into quantitative research and systematic trading, bringing the same rigor and builder's mindset to alpha-seeking problems.

Experience

A timeline of my key roles in data science, ML engineering, and quantitative research.

Openspace Ventures

Lead Developer and Data Scientist

Full-time · Jul 2023 - Present · Singapore · On-site
  • Led data, ML, and infrastructure initiatives, translating investment theses into actionable, data-backed insights.
  • Engineered production-grade services and end-to-end ML systems on GCP to identify promising investment opportunities from alternative data.
  • Managed the firm's cloud infrastructure and DevOps practices, optimizing for performance, cost, and scalability.
  • Drove measurable outcomes for portfolio companies by leading targeted ML projects in areas like demand forecasting and churn modeling.
Python
PyTorch
XGBoost
FastAPI
OpenAI
MLflow
Terraform
GCP
BigQuery
PostgreSQL
Docker
CI/CD
DevOps

National University of Singapore

Teaching Assistant (IS3107 Data Engineering)

Contract · Aug 2023 - Dec 2023 · Singapore · On-site
  • Conducted weekly lab sessions on data engineering and big data processing, with hands-on exercises and tutorials
  • Guided students in implementing data pipelines using Apache Airflow
  • Provided consultation sessions for course projects and assignments
Python
SQL
Apache Spark
Airflow

Datature

Software Engineering Intern

Internship · Jan 2023 - Jun 2023 · Singapore · On-site
  • Developed a full-stack application using Retool, JavaScript, Express.js, and Docker for real-time data analytics on customer engagement, centralizing sales reporting capabilities
  • Implemented a proof-of-concept Cloud Run REST API using Flask and TensorFlow for training deep learning models, enabling customers to train computer vision models via HTTP requests; explored gRPC as an alternative communication protocol for training on edge devices with limited resources
TypeScript
TensorFlow
Python
Flask
Docker
Google Cloud Platform

GIC

Quantitative Research Intern

Internship · Aug 2022 - Nov 2022 · Singapore · On-site
  • Analysed financial time series through exploratory data analysis (EDA) and presented data-driven insights using visualisations in R
  • Collaborated with quantitative strategists to generate and evaluate signals in R, based on ideas from academic research, for forecasting future price movements
R
SQL

Accenture

AI/Data Analytics Intern

Internship · May 2022 - Aug 2022 · Singapore · On-site
  • Implemented client requirements for a conversational AI project during UAT using Node.js and Dialogflow CX.
  • Developed automated Python scripts to test the conversational AI and related APIs.
  • Analyzed text data to create actionable reports for the AI team.
  • Translated business requirements into API logic and automated testing.
Python
JavaScript
Dialogflow
Node.js

PwC

Risk Assurance (Data Trust Services) Intern

Internship · Dec 2021 - Jan 2022 · Singapore · On-site
  • Collaborated closely with a team of consultants using Agile methodology to gather business requirements for a supply chain digitalisation project
  • Designed mock-ups using Figma to translate business requirements into actionable items for software developers
Figma

Projects

A selection of my work in machine learning, data science, and automation.

Startup Funding Success Prediction
Developed statistical models and ML algorithms to predict startup funding success, integrating multiple data sources including web traffic and founder profiles.
Python
PyTorch
XGBoost
MLflow
Machine Learning
LLM-Powered Visual Document Understanding
Built an end-to-end system combining fine-tuned YOLO models with LLMs for financial document analysis and investment due diligence automation.
PyTorch
HuggingFace
RAG
LLM
Deep Learning
Image Captioning with Deep Learning
Built encoder-decoder models with attention to generate descriptive image captions. Used PyTorch and Weights & Biases for experiment tracking.
PyTorch
HuggingFace
W&B
Deep Learning
LLM-Powered Data Analyst
Developed a RAG system for Singapore-specific data queries. Created a vector database with FAISS for efficient similarity search.
LangChain
FAISS
Streamlit
RAG
LLM
Housing MLOps Pipeline
Built an end-to-end ML pipeline with Dockerized Airflow workflows for data processing and MLflow for model versioning and tracking.
Docker
Airflow
MLflow
MLOps
Kickstarter Campaign Success Predictor
Built a model predicting Kickstarter campaign success to help backers optimize investments by analyzing historical data to identify key success factors.
Python
Scikit-learn
Data Analysis
Machine Learning