Deepak Krishnan

Data Scientist turned Veteran Mobile Developer.

Specialties: Machine Learning, Mobile Development, Big Data, Backend Programming

 

A few of my projects:

  • Natural Language Search : Given a data store, the module lets you perform natural language search on the fields of your choice. Behind the scenes, the module does an n-gram lookup against the fields. To identify the conditionals associated with an n-gram, productions are defined using a CFG. To avoid performance overheads, the system indexes the fields in a fast on-disk lookup structure and caches frequent lookup information in an in-memory tree data structure.

  • Content Classification Engine : Machine learning module wrapped as a RESTful API and used to classify web pages into a nested taxonomy at AddThis.com. The engine ingests data on the order of petabytes per month.

  • Content Recommendation Engine : A content recommendation solution developed for AddThis.com. It combines collaborative filtering with contextual information to recommend semantically related content to the user.

  • Lothar : Event detection using a custom-built CFG module and an entity recognition engine, with millisecond response time per HTML page. Lothar can compile dynamically added productions into a finite automaton, unlike most existing grammar-based solutions, which need recompilation every time new productions are added. The system could thus be described as a grammar of grammars. Lothar has a large built-in semantic knowledge graph and uses it to annotate incoming text. Both the knowledge graph and the event grammar are pluggable and can be replaced at will. Unlike other systems of this kind, Lothar uses a state machine model, which is its key differentiator in performance.

  • Hydra : Hydra is a distributed data storage and processing solution developed at AddThis. It can ingest data, transform it, and build tree output from the transformed data. The tree structure can be queried as part of machine learning pipelines or for web integrations. Hydra is available here.

  • Venalogic.com : Social sports analytics for the NFL. The application processes social media streams to detect the entity being discussed and the sentiment associated with it. It then uses this information to compute the top influencers and a time-series visualization of aggregated sentiment, popularity and number of mentions per player.

  • RSSPipe : A Flume plugin that imports RSS feed elements into an HTable on HBase at configurable intervals. The mapping of RSS fields to qualifiers in the HTable column family is configurable.
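
As a rough illustration of the n-gram lookup step described under Natural Language Search, here is a minimal sketch. It is not the production module: the names ngrams, FieldIndex and match are hypothetical, a plain in-memory dictionary stands in for the on-disk lookup structure, and the CFG productions and the caching layer are left out.

```python
from collections import defaultdict


def ngrams(tokens, max_n=3):
    """Yield every contiguous n-gram of length 1..max_n as a string."""
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            yield " ".join(tokens[i:i + n])


class FieldIndex:
    """Toy index: an in-memory dict standing in for the disk structure (assumption)."""

    def __init__(self, records, fields):
        # Maps a lowercased field value to the (field, record id) pairs holding it.
        self.index = defaultdict(set)
        for rec_id, rec in enumerate(records):
            for field in fields:
                value = str(rec.get(field, "")).lower()
                if value:
                    self.index[value].add((field, rec_id))

    def match(self, query, max_n=3):
        """Return each query n-gram that equals an indexed field value."""
        tokens = query.lower().split()
        hits = {}
        for gram in ngrams(tokens, max_n):
            if gram in self.index:
                hits[gram] = sorted(self.index[gram])
        return hits


records = [
    {"city": "New York", "type": "hotel"},
    {"city": "Boston", "type": "hotel"},
]
idx = FieldIndex(records, fields=["city", "type"])
result = idx.match("hotel in new york")
print(result)
```

In this sketch the query n-grams "hotel" and "new york" are matched to the type and city fields respectively; a grammar layer would then decide how those matches combine into conditions.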

 

Industry :  Media and Advertising, Hi-Tech, Sports and Fitness, Politics, Real Estate

Specialization or business function : Media and Advertising (Clickrate Optimization, Multi-Touch Attribution, Media Mix Analysis), R&D

Technical function : Data Visualization (Dashboards & Scorecards, Statistical Graphics), Analytics (Predictive Modeling, Data Mining, Real-time Analytics, In-Memory Analytics, Machine Learning, Data Preparation, Regression Analysis, Natural Language Processing), Web Analytics, Data-Driven Application Development, Data-Driven Mobile Applications

Technology & tools : Big Data and Cloud (Amazon Elastic MapReduce, Cloudera, Hortonworks, Pentaho, MongoDB, DataStax, Neo4j Graph Database, MySQL, PostgreSQL), In-Memory Appliances

 
 


Additional Info

Location

Kochi, India

Availability

Actively Looking

Skills and Technologies I Mastered