Justin Bates

Feel free to shoot me an email at jujbates@gmail.com

Hard Copy Resume Here

Experienced data engineer with a strong three-year track record at a prominent data consulting firm. Skilled in Python and SQL and proficient in data analysis. Specialized in designing, maintaining, and optimizing data infrastructure for comprehensive data lifecycle management. Excited to apply my expertise in a dynamic and challenging setting, using my problem-solving skills to deliver impactful solutions.

Projects

Beer Recommendation Engine

https://github.com/jujbates/Beer-Recommendation-System

• Developed a beer recommendation engine with collaborative filtering techniques, namely K Nearest Neighbors and Singular Value Decomposition.

• Achieved low error rates (RMSE: 0.1317, MAE: 0.0547) with the KNN model through 5-fold cross-validation, demonstrating effectiveness in predicting beer preferences.

• Leveraged the combined power of KNN and SVD to provide more relevant recommendations, indicating potential for improved user satisfaction and exploration of diverse beer options.

Report: Beer_Recommendation_Engine_Project_Report.pdf

Presentation: Beer_Recommendation_Engine_Presentation_Report.pdf

S&P Portfolio Optimization

https://github.com/jujbates/Portfolio-Optimization

• Designed an optimized portfolio to maximize the Sharpe ratio, outpacing the S&P 500 by over 10% in cumulative return in an analysis of 2 years of data.

• Utilized web scraping and data preprocessing techniques to gather adjusted close prices of S&P 500 companies for portfolio optimization.

• Implemented bootstrap and Gibbs sampling methods to evaluate portfolio performance, achieving a significant increase in return.

Report: S&P_Portfolio_Optimization_Report.pdf

Presentation: S&P_Portfolio_Optimization_Presentation.pdf

Big Mountain Resort Analysis

https://github.com/jujbates/Big-Mountain-Resort

• Using data analysis and predictive modeling, the project recommends a weekend ticket price increase to $86 and extending the season to 130 days to achieve the desired profit margin.

Report: Whitefish_Mountain_Resort_Project_Report.pdf

Presentation: Whitefish_Mountain_Resort_Presentation.pdf

Bike Sharing Usage Prediction Model

https://github.com/jujbates/Bike-Sharing-Usage-Prediction-Model

• Built neural networks without a framework to analyze bike-sharing data and generate sales predictions.

IMDB Sentiment Analysis Model

https://github.com/jujbates/IMDB-Sentiment-Analysis-Model

• Designed and developed neural network models with PyTorch to classify the sentiment of IMDB film reviews.

• Deployed an endpoint on AWS API Gateway that receives user data and sends it to AWS Lambda, which processes the data and forwards it to the deployed model’s endpoint on AWS SageMaker. The model classifies a user’s review as positive or negative.

Experience

Data Scientist

ProductOps, Santa Cruz, CA

• Orchestrated end-to-end data management, including ingestion into data lake, ETL processes, and data warehousing using AWS Glue and Athena.

• Leveraged AWS CDK and implemented a robust CI/CD pipeline to ensure seamless operations throughout the data management lifecycle.

• Implemented predictive models using AWS technologies (Kinesis Data Streams, Glue, Redshift, Athena, Step Functions, SageMaker) to improve machine repair downtime forecasting and real-time outlier detection in sensor data.

• Improved processing efficiency of claim data, optimizing the workflow from sales at the manufacturer level to installation by on-site contractors, and submission to utility companies nationwide, leading to streamlined operations and enhanced turnaround times.

January 2021 - September 2023

Field Special Data Analyst

Aceolution @ Google (Google Maps), Santa Cruz, CA

• Collected and labeled location ground truth data for various projects to allow Google Engineers to improve models for Google Maps tools.

• Post-processed GIS and GNSS datasets using outlier detection, smoothing of noisy data, interpolation to fill missing data, and data validation to ensure reliable analysis results.

July 2019 - November 2019

Full Stack Engineer

Warrior Media Inc. (Health & Fitness E-commerce Startup), Santa Cruz, CA

• Over the first 12 months, contributed to 3x subscriber growth and a 200%+ revenue increase.

• Designed, developed, launched, and managed numerous apps for 3 company brands.

• Managed the app development, testing, and production deployment pipeline with Heroku and AWS.

• Developed dashboards and an ETL workflow to process KPIs such as CPA and CPC, along with A/B testing results, from various data sources.

March 2017 - August 2018

Security Intern

Seagate Technology (eSecurity), Scotts Valley, CA

• Developed an internal tool that analyzed network traffic in order to generate security reports.

• Trained international colleagues to operate the network analyzer and to read and gain insights from the generated reports.

June 2016 - September 2016

Undergrad Researcher, University of California: Santa Cruz

Genomic Data Engine Project, Santa Cruz, CA

• Developed a Genomic Data Engine with the Matchmaker Exchange API to sync with GA4GH’s databases.

• Converted Matchmaker Exchange JSON objects to MySQL to optimize data storage.

• Configured Seagate’s Kinetic drives using DHCP to access the drives over a private network.

• Designed an object-oriented database with Seagate’s Kinetic API to organize and store genomic data.

September 2015 - December 2016

Programming Tutor/Grader

University of California: Santa Cruz Undergraduate Courses, Santa Cruz, CA

• Taught younger undergraduate students unfamiliar programming concepts in C, Python, and Java.

• Scheduled lesson plans to hold smooth, understandable tutoring sessions.

• Graded assignments with a thorough understanding of the material, giving advice and comments along with each grade.

October 2014 - June 2016

Web Developer Intern

Business Application Technology Services Department, City National Bank, Los Angeles, CA

• Collaborated with colleagues outside technology services to enhance existing applications using C# with ASP.NET and MS SQL Server Management Studio.

• Documented other workflow applications that outsourced contractors wrote for the bank.

June 2015 - September 2015

Education

University of California: Santa Cruz

Bachelor of Science Computer Engineering

Minor Computer Science

August 2012 - March 2017

Certificates

Udacity Nanodegree

Deep Learning - Certificate Link Here

January 2019

Springboard Bootcamp

Data Science Career Track

May 2020 - December 2020

Skills

Programming Languages & Tools

Languages:
Python, SQL, JavaScript, HTML, CSS, C, C#, Java

Applications:
Jupyter Notebook, Excel, Power BI, Access, Tableau, AWS (EC2, SageMaker, Lambda, ECS)

Relevant Python Libraries:
pandas, numpy, django, pytorch, keras, sklearn, matplotlib, sqlalchemy

Minimal Experience
C++, MATLAB, tensorflow, JD Edwards ERP