Motion Gesture Sign Language Translation Model and Pipeline

Project GitHub

Project Stack

Currently, the project is using:

  • PyTorch, PyTorch Lightning, Hugging Face Transformers
    • model architecture, training loop, initial model weights
    • distributed training (multi-node and multi-GPU); see the first sketch after this list
  • Kubernetes/Kubeflow
    • distributed multi-node training
  • MLflow (Databricks hosted)
    • experiment analytics and tracking
    • artifact store
  • ONNX and TensorRT
    • deployment and production-ready inference; see the export sketch after this list
  • GitHub Actions, Docker
    • CI/CD triggered on new production weights (marked in MLflow)
    • building the inference container
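
As a concrete picture of how the first three items fit together, here is a minimal sketch of the training core: a Lightning module wrapping a pretrained VideoMAE checkpoint, trained with DDP and logged to MLflow. The checkpoint name, experiment name, tracking URI, and device counts are illustrative placeholders, not the project's actual values.

```python
import pytorch_lightning as pl
import torch
from pytorch_lightning.loggers import MLFlowLogger
from transformers import VideoMAEForVideoClassification


class SignClassifier(pl.LightningModule):
    """Wraps a Hugging Face VideoMAE checkpoint in a Lightning module."""

    def __init__(self, num_labels: int, lr: float = 5e-5):
        super().__init__()
        self.save_hyperparameters()
        # ignore_mismatched_sizes swaps in a fresh classification head
        self.model = VideoMAEForVideoClassification.from_pretrained(
            "MCG-NJU/videomae-base",  # placeholder checkpoint
            num_labels=num_labels,
            ignore_mismatched_sizes=True,
        )

    def training_step(self, batch, batch_idx):
        # batch["pixel_values"]: (B, 16, 3, 224, 224); batch["labels"]: (B,)
        out = self.model(pixel_values=batch["pixel_values"], labels=batch["labels"])
        self.log("train_loss", out.loss)  # forwarded to MLflow by the logger
        return out.loss

    def configure_optimizers(self):
        return torch.optim.AdamW(self.parameters(), lr=self.hparams.lr)


trainer = pl.Trainer(
    accelerator="gpu",
    devices=4,      # GPUs per node (illustrative)
    num_nodes=2,    # multi-node scheduling handled by Kubeflow (illustrative)
    strategy="ddp",
    logger=MLFlowLogger(
        experiment_name="asl-videomae",  # placeholder experiment name
        tracking_uri="databricks",       # Databricks-hosted MLflow
    ),
)
# trainer.fit(SignClassifier(num_labels=...), train_dataloaders=...)
```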

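On the deployment side, the hand-off to TensorRT typically goes through an ONNX export of the fine-tuned weights, logged to MLflow so the CI/CD trigger can pick it up. A sketch under those assumptions; the wrapper exists only to unwrap Hugging Face's output object, and the paths and file names are illustrative:

```python
import mlflow
import torch
from transformers import VideoMAEForVideoClassification


class LogitsOnly(torch.nn.Module):
    """ONNX export wants plain tensors, so unwrap the HF output object."""

    def __init__(self, model):
        super().__init__()
        self.model = model

    def forward(self, pixel_values):
        return self.model(pixel_values=pixel_values).logits


model = VideoMAEForVideoClassification.from_pretrained("./finetuned")  # placeholder path
model.eval()

dummy = torch.randn(1, 16, 3, 224, 224)  # (batch, frames, channels, H, W)
torch.onnx.export(
    LogitsOnly(model),
    (dummy,),
    "videomae-asl.onnx",
    input_names=["pixel_values"],
    output_names=["logits"],
    dynamic_axes={"pixel_values": {0: "batch"}},  # keep batch size dynamic
)

# Log the graph to MLflow; marking a run's weights as production-ready is
# what the GitHub Actions workflow keys off of.
with mlflow.start_run():
    mlflow.log_artifact("videomae-asl.onnx")
```
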
What is this project?

This is an end-to-end MLOps project I started during Summer 2025. The goal was to learn the skills needed to become a proficient ML engineer by exploring the full MLE pipeline.

The application fine-tunes VideoMAE on the ASL Citizen dataset. It is a spinoff of a project I led in Autoslug, but I decided to take a different approach and get an MVP working by fall.
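
For a sense of what the fine-tuning data path involves: VideoMAE consumes fixed-length 16-frame clips, so each ASL Citizen video has to be subsampled and normalized before it reaches the model. A minimal sketch, assuming the clip is already decoded into a NumPy array; the checkpoint name is a placeholder:

```python
import numpy as np
import torch
from transformers import VideoMAEImageProcessor

processor = VideoMAEImageProcessor.from_pretrained("MCG-NJU/videomae-base")  # placeholder


def clip_to_pixel_values(frames: np.ndarray, num_frames: int = 16) -> torch.Tensor:
    """Uniformly subsample a decoded video to VideoMAE's fixed clip length.

    frames: (T, H, W, 3) uint8 array, one entry per decoded frame.
    Returns pixel_values of shape (1, num_frames, 3, 224, 224).
    """
    idx = np.linspace(0, len(frames) - 1, num_frames).astype(int)
    sampled = [frames[i] for i in idx]
    # The processor resizes, crops, and normalizes each frame.
    return processor(sampled, return_tensors="pt")["pixel_values"]
```

The resulting tensor feeds straight into the training loop sketched above.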

Where I started

I started with the skills I was already familiar with: PyTorch, Hugging Face (Transformers), Kubernetes, and Docker. These are all skills I learned from working on machine learning research throughout my time at UC Santa Cruz.

From there, I started looking into resources such as Full Stack Deep Learning. I highly recommend it for anyone interested in MLE; it is an excellent resource.