
May 2019 – May 2020
Raleigh, USA

Software Engineer / Data Scientist Intern


  • Developed a PoC for a demand forecasting application using ARIMA and LSTM (Long Short-Term Memory network) models in Azure Databricks, with MLflow for model management. The project clarified the latency tradeoffs of cloud deployment compared to deployment on local machines.
  • Built a predictive modeling application for optimal pricing across the entire product range of ABB’s IAMA business using ML approaches (Bayesian regression, market basket analysis, etc.) together with economic indicators. Integrated the output of the algorithms into Power BI dashboards in the production environment. Contributed to a 10-15 percent revenue increase for this division.
    November 2017 – July 2018
    Kolkata, India

    Software Developer


  • Developed REST APIs, asynchronous queuing systems, front-end components, and server-client integration for dashboards.
  • Developed a Node application that simplified development and testing by reading data sent by weather sensors, and wrote shell scripts to back up databases to Amazon EC2 servers. Also developed microservices for data migration tools. Built and managed Docker container clusters, and employed Kubernetes to orchestrate the deployment, scaling, and management of Docker containers on AWS, alongside Elasticsearch, Lambda functions, and Kafka.
  Projects


    • Implemented modified convolutional neural networks (VGG-16, GoogLeNet, Inception) on a chest X-ray image dataset to detect lung disease, achieving an accuracy of 0.8. Implemented the project in Python using the TensorFlow and PyTorch frameworks.
    • Using GloVe word vectors as the underlying data model, developed and compared CNN, RNN, and HAN models for text classification on the IMDB reviews dataset (Natural Language Processing).
    Neural Networks Course Project | Team Size: 1 | Fall’2018 - Spring’2019
    Technologies: Python

    Recommender Systems (CS 591)

    • Created an ALS-based recommender system using Apache Spark to suggest new musical artists to users based on their listening history (implicit feedback), and performed a parameter sweep to select optimal parameters. The model achieved a score of 95.294% for rank 10.
    • Used DeepWalk to generate random walks over a heterogeneous information network (producing low-dimensional node vectors with word2vec) to predict user-movie pairs on IMDB datasets.
    Recommender Systems Course Project | Team Size: 1 | Spring’2019
    Technologies: Apache Spark, Python

    Bayesian and Longitudinal Modeling (ST 537, ST 540)

    • Predicted the International Roughness Index of the road network in the state of North Carolina using random coefficient models from Bayesian linear regression and spline models from longitudinal regression.
    Bayesian Analysis Course Project | Team Size: 1 | Fall’2018 - Spring’2019
    Concepts Used: Bayesian Inference


    List of courses taken during my Master’s

    Fall 2019

    • CSE 519 - Data Science Fundamentals
    • CSE 590 - Introduction to Modern Cryptography
    • CSE 303 - Theory of Computation
    • CSE 373 - Analysis of Algorithms

    Spring 2020

    • CSE 544 - Probability & Statistics For Data Scientists
    • CSE 320 - System Fundamentals II
    • CSE 307 - Principles of Programming Languages

    Fall 2020

    • CSE 512 - Machine Learning
    • CSE 532 - Theory of Database Systems
    • CSE 518 - Foundations of Human Computer Interaction