Uday Krishna
SDE III at Fanatics
Programmer who knows ML. I spend my free time building stuff, or building bots to do my stuff.
Experience



SDE III Jan 2021 - present
Fanatics Ecommerce
  1. Created an automatic DAG generation program to better explain the selection logic for stock replenishment.
  2. Implemented a Python Pulsar client library from scratch, with JWT support for authorizing consumption from or production to a topic. Also added KMS-based encryption and decryption support to the client to protect PII data in transit.
  3. Designed and implemented automatic schema detection and decoding in the Pulsar client library, which uses the Pulsar schema registry to fetch the schema used to encode each message. This made message decoding more seamless and reduced schema incompatibility issues on consumers.
  4. Implemented a custom Vault client for Python to allow easier connection to the internal Vault instance with both EC2 and IAM auth.
  5. Designed a Consumer Manager application for easier management of Pulsar consumers. The application abstracts out scaling, dead-letter queuing and logging, making consumers easier to write.
  6. Authored a Pulsar Avro code generator library for Python, which takes in an Avro schema and generates Pulsar-compatible schema classes that can be used for development.
  7. Authored a JSON Schema-based Avro schema validator with error-correction suggestions that hint at how to fix the schema.
  8. Authored a custom Python logging client library, extending Python's logging module, to stream logs to the internal Kafka logging pipeline, where they are queryable via Graylog (Elasticsearch wrapper).
  9. Implemented a custom Apache Arrow Flight server in Go to work as a sidecar for Informix, allowing faster data ingestion and updates by utilizing external table capabilities. This gave 32% faster data throughput when updating a large number of rows.

Data Scientist Jan 2019 - Dec 2020
Toppr.com
  1. Reduced similar-questions service latency from ~3s to <300ms, even for questions with many text tokens, by improving OCR API inputs and Elasticsearch queries.
  2. Designed a search history pipeline using Kinesis and Lambda to push search data to an S3 data lake in real time, while using DynamoDB to store paginated search history across different namespaces using the bucket pattern. This kept API latency under 10 ms without compromising analytics use cases.
  3. Improved similar-question suggestions for LaTeX/equation-based questions from 84% to 96% matching by creating custom mappings and adding decay functions to the relevant queries.
  4. Deployed content search pipelines that asynchronously and reliably update searchable content across a wide range of dynamically generated personalized syllabi.
  5. Designed a zero-maintenance serverless solution for the solver app service using the WhatsApp API, backed by Lambda and DynamoDB.
  6. Deployed search pipelines to AWS Step Functions to improve monitoring of the seeding flow.
  7. Designed a question recommendation evaluation interface to help identify lapses in question recommendation quality.
  8. Implemented a brand name identification model inspired by "Nilesh Dalvi, Marian Olteanu, Manish Raghavan, and Philip Bohannon. 2014. Deduplicating a places database. In Proceedings of the 23rd International Conference on World Wide Web (WWW '14). ACM, New York, NY, USA, 409-418."
  9. Designed a data pipeline that identifies school name clusters using expectation maximization, allowing the marketing team to target user groups.
  10. Contributor to NLTK; provided an implementation of the METEOR score: "Lavie, A., & Agarwal, A. (2007, June). METEOR: An automatic metric for MT evaluation with high levels of correlation with human judgments. In Proceedings of the Second Workshop on Statistical Machine Translation (pp. 228-231). Association for Computational Linguistics."
  11. Migrated an older Solr cluster to a containerized SolrCloud deployment, which not only helped increase SLA from 76% to almost 99% but also made monitoring and maintenance much easier.

Senior Data Scientist Apr 2018 - Dec 2018
Merilytics Inc
  1. Built and maintained a containerized job orchestrator that extracts sentiment, keywords and topics from news articles and news feed emails, as both online and batch jobs.
  2. Built, maintained and improved a containerized, real-time, multi-threaded keyword extraction service (Key Extraction) using Flask + Gunicorn, deployed on Docker Swarm.
  3. Implemented a containerized concept extraction service that combined the previously mentioned keyword extraction techniques with n-gram extraction techniques based on frequent n-grams, RAKE and TextRank to extract uni-, bi- and tri-grams, then linked each keyword to a referenceable topic by querying DBpedia.
  4. Built trend APIs using MongoDB's aggregate queries to display current trends in supported industries by various time segments.
  5. Built a containerized document clustering service with automatic cluster label generation capabilities in Python.
  6. Built build and deployment pipelines for keyword extraction, concept extraction, document clustering and article mining services using Bitbucket Pipelines and docker-compose.
  7. Built an extensible targeted article mining platform using Scrapy, with spiders that could mine both article text and article HTML content from 20+ sites with very high accuracy.
  8. Built an email news feed mining application that could mine and process email news feeds in parallel with the article mining platform.
  9. Built a multi-level text classification model using a series of semi-supervised and supervised learning methods to enable working with very low quantities of tagged data.
  10. Speaker for data science at a company-wide knowledge management conference.
  11. Engineered features for multiple huge financial datasets (TU datasets), >200M each, using Spark (PySpark) on Databricks.

Data Scientist Jul 2017 - Apr 2018
Merilytics Inc
  1. Built an end-to-end conjoint analysis pipeline for a growing trip-based startup.
  2. Built an end-to-end service for a large Brazilian supermarket chain to perform market basket analysis on sales data using Eclat in R, with MS SQL Server for importing and exporting data automatically.
  3. Built a topic extraction module based on gensim to quickly swap between LDA and LSI without affecting the format of the output.
  4. Built multiple keyword extraction modules to extract keywords from news and social data. Integrated each of these modules into the product pipeline as a Dockerized HTTP service, built with Flask and Gunicorn and deployable via docker-compose. A few of the key implementations are as follows:
    1. Implemented a keyword extraction module with multi-threaded training capability based on 'Parameswaran, Aditya, et al. “Towards the Web of Concepts.” Proceedings of the VLDB Endowment, vol. 3, no. 1-2, 2010, pp. 566-577, doi:10.14778/1920841.1920914.' and extended it with n-gram-based NLP techniques to improve the quality and ranking of the keywords.
    2. Built a keyword extraction module based on SG-Rank to extract industry-relevant keywords from news articles.
    3. Built a keyword extraction module for extracting keywords from scientific text using n-gram-based techniques and TextRank.
  5. Built a custom data tagging interface based on PyBossa, specifically tuned for tagging articles for NLP-related tasks, after which the data could be exported to CSV or Excel for other pipelines to pick up for training models. These pipelines reduced our training time significantly.
  6. Added support for importing custom formats into the tagging platform mentioned above.

Product Development Intern Jul 2016 - Dec 2016
Blue Yonder (formerly JDA Software)
  1. Designed UI screens for JDA's Master Data Management suite using Ext JS, following the MVC pattern.
  2. Developed code to display appropriate HTTP status codes on the Spring-based REST API of the Data Management Suite.
  3. Designed the client-side connector back end for the UI in Java.
  4. Engineered unit test cases for the Master Data Management Suite.

Engineering Intern Jun 2015 - Jul 2015
Carborundum Universal Limited
  1. Investigated the steps required to change the configuration of the Poggi machine and the time taken for each step, highlighting the most time-consuming steps.
  2. Suggested eliminating various redundant steps, which helped decrease setup time by 23%.
  3. Collected data on the time required and the number of workers needed to perform each task, and authored a comprehensive report on optimizing each task during the setup of the Poggi machine.
  4. Investigated leakage during the loading of the machine and provided suggestions to decrease spillage.

Education



Birla Institute of Technology and Science, Pilani 2013 - 2017
Bachelor of Engineering (Hons.)
  1. Joint Director of a not-for-profit organization, Scio Benevolent Foundation; member of the spearhead team that made it possible to reach more than 16,000 needy students.
  2. 2nd place in the Ground Reality business plan competition, BITS-Pilani Hyderabad.
  3. Winner, Hult Prize at BITS-Pilani Dubai.

Sri Chaitanya Junior Kalasala 2011 - 2013
High School

The Hyderabad Public School, Ramanthapur 2001 - 2013
High School
  1. 1st in computer science during the academic year 2007-2008.
  2. 3rd in science fair during the academic year 2009-2010.

Honors



Jul 2018
Grand Prize winner - 6th Edition LVPEI Engineering the Eye hackathon
Designed and built a cost-effective prototype ophthalmoscope to diagnose Retinopathy of Prematurity (ROP). ROP, if not treated swiftly, causes permanent blindness in infants within a matter of days. In India alone, ~150K infants go undiagnosed and ~23M infants are at significant risk of ROP.

Feb 2018
1st Prize winner - Merilytics Hackathon
Built a Microsoft Teams-based chatbot that summarizes the article at any link provided. Additionally, all the summarized articles could be queried with simple chat-based search commands.

Mar 2016
2nd Place in Ground Reality

Dec 2015
Finalist in innovation contest by TBI at BITS Pilani Hyderabad
Was a finalist in the innovation contest organised by the Technology Business Incubator, BITS Pilani Hyderabad.

Dec 2015
Winner, Hult Prize at BITS Pilani Dubai

Dec 2008
First in Computer Science
Was awarded a merit certificate for scoring the highest in computer science during the academic year 2007-2008.

Dec 2008
3rd in Science fair
Placed 3rd in the science fair during the academic year 2009-2010 for an electronic watchdog project that detects people using infrared light.

Skills



Python; R; Machine Learning; Data Analysis; Shell Scripting; Web Design; Chemical Engineering; Social Entrepreneurship; Android Development; Multithreading; Statistics; Regression Testing; Big Data; Web Applications; Linux; Microsoft Excel; Java; C; JavaScript; HTML; CSS; Photoshop; AutoCAD; Microsoft Word; PowerPoint; PHP; SQL; Microsoft Office; ANSYS; MATLAB; MongoDB; Microsoft Azure; Amazon Web Services (AWS); Keras; Pandas (Software); Web Scraping; Aspen Plus; Natural Language Processing; Regression Analysis; Containerization; Predictive Modeling; Business Insights; Go

Certifications



Machine Learning by Andrew Ng on Coursera (YS6ZMFEQLN23)