Technical Work

Projects

From GenAI enterprise platforms to RNN poetry generators — real systems built and shipped.

GenAI Platform

GenAI Studio — CBA Internal Platform

Commonwealth Bank of Australia

2023 – Present

Enterprise GenAI platform serving 15,000+ CBA employees. Enables teams to build and consume LLM-powered applications for content generation, knowledge retrieval (RAG), and process automation.

  • Defined API interfaces and system boundaries through stakeholder engagement
  • Introduced PromptFlow as the Low-Code GenAI orchestration framework
  • Delivered RAG app backed by Azure AI Search + GPT-4, deployed on AWS
  • Extracted reusable patterns into internal Python libraries
  • Defined team Python coding and deployment standards
GPT-4 · RAG · PromptFlow · Azure AI Search · AWS · Python · OpenAI
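
The retrieve-then-prompt pattern behind a RAG app can be sketched roughly as follows. This is a toy illustration, not the platform's code: a bag-of-words cosine score stands in for Azure AI Search, the final LLM call is omitted, and `DOCS`, `retrieve`, and `build_prompt` are hypothetical names invented for this example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, documents, k=2):
    """Return up to k documents with a non-zero similarity to the query."""
    q = Counter(tokenize(query))
    scored = [(cosine(q, Counter(tokenize(d))), d) for d in documents]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [d for score, d in scored[:k] if score > 0]


def build_prompt(query, passages):
    """Assemble a grounded prompt; sending it to an LLM is out of scope here."""
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}"
    )


# Hypothetical knowledge-base snippets, for illustration only.
DOCS = [
    "To reset your password, open the self-service portal.",
    "Annual leave requests are approved by your manager.",
    "Expense claims must include a receipt.",
]

question = "How do I reset my password?"
prompt = build_prompt(question, retrieve(question, DOCS))
```

In a production setting the `retrieve` step would typically be a call to a search index and the prompt would be passed to a hosted model; both are left out so the sketch stays self-contained.
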
Big Data Platform

On-Premises Big Data Platform

WiseTech Global

2019 – 2023

Designed and operated a production-grade on-premises data platform from scratch — replacing cloud spend with a Kubernetes-native stack that the whole analytics org runs on.

  • Stood up a Spark cluster on Kubernetes (upgraded from 2.x to 3.x), with Ceph as S3-compatible storage
  • Set up JupyterHub + SparkMagic + Livy for self-service PySpark by Data Analysts
  • Deployed and managed Airflow for batch ETL scheduling and monitoring
  • Integrated Kafka → Spark Streaming for near-real-time analytics
  • Evaluated Delta Lake vs Parquet for raw data storage
Spark · Kubernetes · Airflow · Kafka · Delta Lake · Ceph · JupyterHub
ML / Recommender

Software Feature Recommender Engine

WiseTech Global

2020 – 2022

Collaborative Filtering recommender built on Spark MLlib to surface relevant software features to users — driven by behavioural telemetry analysis.

  • Prototyped using matrix factorisation (ALS) via Spark MLlib
  • Ran random search + heuristic global hyper-parameter optimisation
  • Built the full feature-engineering pipeline from raw user activity logs
Spark MLlib · PySpark · Collaborative Filtering · Feature Engineering
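
A minimal sketch of the random-search part of hyper-parameter tuning, in plain Python. The search space and `toy_objective` are assumptions made for illustration: the keys loosely mirror Spark MLlib ALS's `rank`, `regParam`, and `maxIter`, and the objective is a stand-in for a validation metric, where a real run would train and evaluate an ALS model. The heuristic global optimisation step is not shown.

```python
import random


def random_search(objective, space, n_trials=50, seed=0):
    """Sample configurations uniformly from `space` and keep the lowest-loss one."""
    rng = random.Random(seed)
    best_params, best_loss = None, float("inf")
    for _ in range(n_trials):
        params = {name: rng.choice(values) for name, values in space.items()}
        loss = objective(params)
        if loss < best_loss:
            best_params, best_loss = params, loss
    return best_params, best_loss


# Hypothetical search space over common ALS knobs.
SPACE = {
    "rank": [8, 16, 32, 64],
    "reg_param": [0.01, 0.05, 0.1, 0.5],
    "max_iter": [5, 10, 20],
}


def toy_objective(params):
    """Stand-in for validation error; not derived from any real model."""
    return (
        abs(params["rank"] - 32) / 32
        + abs(params["reg_param"] - 0.1)
        + abs(params["max_iter"] - 10) / 10
    )


best_params, best_loss = random_search(toy_objective, SPACE, n_trials=200, seed=42)
```

Fixing the seed makes a search reproducible, which helps when comparing runs across different feature sets.
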
NLP Chatbot

Client-Facing NLP Chatbot

Hyper Anna

2017 – 2019

Production chatbot for enterprise customers — NLU pipeline combining Rasa with SpaCy, BERT, and TensorFlow components.

  • Led architecture and development of the data ingestion pipeline enabling SMB market entry
  • Prototyped BERT semantic search for sales FAQs
  • Introduced and presented Deep Learning (RNNs) company-wide
  • Built internal DevOps Hubot for deployment automation
Rasa · BERT · SpaCy · TensorFlow · Python · Scala · Spark
NLG / Generative AI

Classical Chinese Poetry Generator

Personal Project

2015 – 2016

An RNN-based natural language generation model that writes classical Tang-dynasty poetry. Deployed as a WeChat chatbot with zero marketing — tens of thousands of users, hundreds of thousands of poems generated.

  • Implemented in Python / TensorFlow with LSTM cells (open-sourced on GitHub)
  • Compared GRU vs LSTM cells via A/B testing
  • Full solo operation: product, marketing, analytics, dev, devops
TensorFlow · LSTM · NLG · Python · WeChat
Recommender System

ReWire – News Article Recommender

Fairfax Media

2012 – 2015

Item-based Collaborative Filtering recommendation engine processing billions of records per run on Amazon EMR. In A/B testing it achieved a 36% higher CTR than the third-party incumbent.

  • Built with Java / Apache Mahout on on-demand EMR Hadoop clusters (EC2 + S3)
  • Applied business rules: metadata boosting + soft recency filtering
  • Served the full SMH and The Age newsletter audiences
Mahout · Hadoop · EMR · Java · S3 · Collaborative Filtering
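
Item-based collaborative filtering with a soft recency weighting can be illustrated with a small Python sketch. The original system used Java and Apache Mahout; this is not that code. The reading data, the exponential half-life decay, and all function and variable names are assumptions chosen for illustration, and metadata boosting is not shown.

```python
import math
from collections import defaultdict


def item_similarities(interactions):
    """Cosine similarity between items, from binary user -> items interactions."""
    item_users = defaultdict(set)
    for user, items in interactions.items():
        for item in items:
            item_users[item].add(user)
    sims = defaultdict(dict)
    items = list(item_users)
    for i, a in enumerate(items):
        for b in items[i + 1:]:
            overlap = len(item_users[a] & item_users[b])
            if overlap:
                s = overlap / math.sqrt(len(item_users[a]) * len(item_users[b]))
                sims[a][b] = sims[b][a] = s
    return sims


def recommend(user, interactions, sims, age_days, half_life_days=2.0, k=3):
    """Score unseen items by similarity to the user's history, decayed by age."""
    seen = interactions[user]
    scores = defaultdict(float)
    for item in seen:
        for other, s in sims.get(item, {}).items():
            if other not in seen:
                scores[other] += s
    # Soft recency: down-weight older articles instead of dropping them.
    for item in scores:
        scores[item] *= 0.5 ** (age_days[item] / half_life_days)
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Hypothetical reading history (user -> article ids) and article ages in days.
READS = {
    "u1": {"a", "b"},
    "u2": {"a", "b", "c"},
    "u3": {"a", "c", "d"},
    "u4": {"b", "d"},
}
AGE_DAYS = {"a": 0, "b": 1, "c": 1, "d": 10}

sims = item_similarities(READS)
recommendations = recommend("u1", READS, sims, AGE_DAYS)
```

Here the ten-day-old article `d` is still returned for `u1`, only ranked below the fresher `c`; a hard recency filter would have removed it outright.
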