- Assisted 100+ students in building end-to-end cloud analytics workflows, from data ingestion through scalable storage to distributed processing.
- Engineered PySpark data transformation pipelines in Zeppelin notebooks to speed up analysis of unstructured datasets.
- Orchestrated MapReduce jobs on distributed VMs for parallel big-data analysis.
- Managed scalable cloud storage with Amazon S3 and the AWS CLI, integrating it with virtualized compute and cloud networking to support the analytics infrastructure.
PySpark
Zeppelin
MapReduce
AWS
CloudLab
Cloud Computing