Dhruthick Mohan
CSE Graduate @ UC San Diego
Hi,
I’m a Computer Science graduate from UC San Diego with a Master’s degree. I have close to two years of hands-on experience in data science and machine learning, particularly in analyzing high-volume geospatial data across various domains. I also have a solid foundation in Python, Java, SQL, and frameworks such as PySpark, TensorFlow, PyTorch, scikit-learn, NumPy, and Pandas.


I’ve worked on projects involving traffic modeling, air quality analysis, music recommendation, and more. At UC San Diego, I specialized in AI and Data Science, where my focus evolved towards recommender systems, NLP, fairness in AI, and AI’s role in diverse domains such as healthcare, disaster management, and streaming services.

I hold a Bachelor's degree in Information Science from Ramaiah Institute of Technology, Bangalore.

I am actively seeking full-time opportunities starting in August 2024, with a focus on Machine Learning, MLOps, Data Science, and Data Analytics.
Industry Experience

Data Scientist

August 2021 - June 2022: India Urban Data Exchange (IUDX)

  • At IUDX, I was part of the Analytics team, where my most prominent contribution was a novel road network construction algorithm. The algorithm used public bus transit data from across the city and Uber's open-source H3 framework to build road segments. The network was then used to build and deploy a state-of-the-art Temporal Graph Convolutional Network, trained on the transit data, to model city-wide road traffic in real time.
  • I also analyzed large-scale geospatial smart city data from domains such as transit management, air quality, and emergency services to draw pertinent insights using efficient visualization techniques, as part of government-funded research.
  • My other tasks involved applying machine learning (regression, boosting, clustering, etc.), deep learning (CNNs, LSTMs, etc.), and mathematical and statistical approaches to problems such as vehicle traffic vs. air quality correlation analysis, traffic modeling, and named-entity recognition.
  • Our team collaboratively built a dashboard that showcased the above work and illustrated to the Ministry of Housing and Urban Affairs how smart city data can be used effectively across city departments.
  • Tech Stack: Python, SQL, PySpark, TensorFlow, Docker, MinIO, Apache Spark, Superset, Flink, Airflow, Kudu, Zeppelin

Data Science Research Intern

March 2021 - July 2021: India Urban Data Exchange (IUDX)

  • As a research intern, I got my hands dirty in several ways. I refactored a custom Python software development kit to add functionality for fetching both historical and real-time smart city data into Pandas data frames, with improved efficiency and authorization for company-wide usage.
  • I also studied the correlation between air quality sensors across the city, and performed geospatial interpolation of readings to visualize metrics over the entire city effectively.
  • One of my final tasks was developing a multi-class classification model using recurrent neural networks. The model classifies grievances that citizens submit in Gujarati to the city municipality through an online portal, translating them to English first. This showcased the possibility of reducing the processing time of posted grievances by using data that was already available.
  • Tech Stack: Python, Pandas, SQL, scikit-learn, TensorFlow, HuggingFace, Docker

Research Experience

Graduate Student Researcher

March 2024 - Present: Center for Applied Internet Data Analysis (CAIDA)

  • Refined a distributed system using AWS Lambda for querying Ookla speed test servers, optimizing request handling to mitigate IP blocking and ensure reliable data collection with dynamic querying and load distribution strategies, enhancing efficiency and performance.
  • Analyzed and visualized Ookla Open Data, examining global internet performance metrics and server data trends.
  • Developed a Go script that uses goroutines to efficiently query network measurement data through GCP BigQuery, collecting millions of traceroute records from various ISPs for research analysis.

Teaching Experience

Teaching Assistant

Spring 2024: CSE 256 Statistical Natural Language Processing

  • This class serves as a graduate-level introduction to NLP with deep learning algorithms. Under Prof. Ndapa Nakashole, my main responsibilities include answering students' questions and grading assignments.

Teaching Assistant

Winter 2024: CSE 203B Convex Optimization Algorithms

  • This graduate-level math course dealt with convex formulations and optimization methods. I assisted Prof. CK Cheng in setting up and grading assignments and held weekly discussion sessions with students to guide them through the course.

Teaching Assistant

Fall 2023: CSE 258 Recommender Systems and Web Mining

  • This course served as an introduction to recommender systems for graduate students. Under Prof. Julian McAuley, I assisted students with their queries and helped them grasp the course material during office hours.

© 2024 Dhruthick Mohan