Motion Gesture Sign Language Translation Model and Pipeline
Project Stack
The project currently uses:
- PyTorch, PyTorch Lightning, Hugging Face Transformers
  - model architecture, training loop, and initial model weights (see the training sketch after this list)
  - distributed training (multi-node and multi-GPU)
- Kubernetes/Kubeflow
  - orchestration of distributed multi-node training jobs
- MLflow (Databricks hosted)
  - experiment analytics and tracking
  - artifact store
- ONNX and TensorRT
  - deployment and production-ready inference (see the export sketch below)
- GitHub Actions, Docker
  - CI/CD triggered on new production weights (marked in MLflow; see the registry sketch below)
  - building the inference container
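To make the training side of the stack concrete, here is a minimal sketch of how a LightningModule can wrap a Hugging Face VideoMAE classifier and log to MLflow. The base checkpoint, label count (ASL Citizen covers roughly 2,700 glosses), node count, and hyperparameters are illustrative assumptions, not the project's actual values.

```python
import pytorch_lightning as pl
import torch
from transformers import VideoMAEForVideoClassification


class SignClassifier(pl.LightningModule):
    """LightningModule wrapping a VideoMAE video classifier."""

    def __init__(self, num_labels: int = 2731, lr: float = 5e-5):
        super().__init__()
        self.save_hyperparameters()
        # Assumed base checkpoint; the classification head is re-initialized
        # for num_labels glosses (hence ignore_mismatched_sizes).
        self.model = VideoMAEForVideoClassification.from_pretrained(
            "MCG-NJU/videomae-base",
            num_labels=num_labels,
            ignore_mismatched_sizes=True,
        )

    def training_step(self, batch, batch_idx):
        # batch["pixel_values"]: (B, num_frames, 3, H, W); batch["labels"]: (B,)
        out = self.model(pixel_values=batch["pixel_values"], labels=batch["labels"])
        self.log("train_loss", out.loss)
        return out.loss

    def configure_optimizers(self):
        return torch.optim.AdamW(self.parameters(), lr=self.hparams.lr)


if __name__ == "__main__":
    from pytorch_lightning.loggers import MLFlowLogger

    trainer = pl.Trainer(
        max_epochs=10,
        accelerator="gpu",
        devices=-1,      # all GPUs on each node
        num_nodes=2,     # multi-node; assumes the cluster provides 2 nodes
        strategy="ddp",  # distributed data parallel across nodes/GPUs
        logger=MLFlowLogger(experiment_name="videomae-asl"),  # assumed name
    )
    # trainer.fit(SignClassifier(), datamodule=...)  # DataModule omitted here
```

On Kubernetes/Kubeflow, each pod would run this same script; Lightning's DDP strategy picks up the rank and world-size configuration from the environment.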
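For the export step, a trained checkpoint can be converted to ONNX and then compiled to a TensorRT engine. The sketch below uses `torch.onnx.export` directly; the input shape, tensor names, and file names are assumptions, and Hugging Face's `optimum` exporters handle the edge cases more robustly.

```python
import torch
from transformers import VideoMAEForVideoClassification

# Assumed checkpoint name; in practice the weights would come from MLflow.
model = VideoMAEForVideoClassification.from_pretrained("MCG-NJU/videomae-base").eval()

# VideoMAE expects (batch, num_frames, channels, height, width).
dummy = torch.randn(1, 16, 3, 224, 224)

torch.onnx.export(
    model,
    (dummy,),
    "videomae.onnx",
    input_names=["pixel_values"],
    output_names=["logits"],
    dynamic_axes={"pixel_values": {0: "batch"}, "logits": {0: "batch"}},
    opset_version=17,
)
# The ONNX graph can then be compiled to a TensorRT engine, e.g.:
#   trtexec --onnx=videomae.onnx --saveEngine=videomae.plan --fp16
```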
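Finally, the CI/CD trigger keys off the model's stage in the MLflow registry. A sketch of how the container-build job might pull the production-marked weights; the registered-model name and the "Production" stage label are assumptions:

```python
import mlflow

# Assumed registered-model name; "Production" is the stage the CI watches for.
MODEL_URI = "models:/videomae-asl/Production"

# Downloads the artifact bundle from the Databricks-hosted registry and
# rehydrates the PyTorch model for packaging into the inference image.
model = mlflow.pytorch.load_model(MODEL_URI)
```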
What is this project?
This is an end-to-end MLOps project I started during Summer 2025. The goal was to learn the skills needed to become a proficient ML engineer by exploring the full MLE pipeline.
The project fine-tunes VideoMAE on the ASL Citizen dataset. It is a spin-off of a project I led in Autoslug, but I decided to take a different approach and get an MVP working by fall.
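Concretely, fine-tuning VideoMAE means turning each ASL Citizen clip into a fixed-length frame stack the model can consume. A rough preprocessing sketch; the 16-frame count and checkpoint are VideoMAE's usual defaults, not necessarily what the project uses:

```python
import numpy as np
import torch
from torchvision.io import read_video
from transformers import VideoMAEImageProcessor

processor = VideoMAEImageProcessor.from_pretrained("MCG-NJU/videomae-base")


def load_clip(path: str, num_frames: int = 16) -> torch.Tensor:
    """Read a video file and sample evenly spaced frames for VideoMAE."""
    frames, _, _ = read_video(path, pts_unit="sec", output_format="THWC")
    idx = np.linspace(0, frames.shape[0] - 1, num_frames).astype(int)
    video = [frames[i].numpy() for i in idx]  # list of HWC uint8 frames
    # Resizes/normalizes and returns pixel_values of shape (1, 16, 3, 224, 224).
    return processor(video, return_tensors="pt")["pixel_values"]
```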
Where I started
I started with the tools I already knew: PyTorch, Hugging Face Transformers, Kubernetes, and Docker, all of which I picked up doing machine learning research throughout my time at UC Santa Cruz.
From there, I worked through resources such as Full Stack Deep Learning. I highly recommend it to anyone interested in ML engineering; it is an excellent resource.