Position: Senior Data Scientist
Location: Newark, NJ
Duration: 6-12 month contract with possible extensions
The Data Scientist will harness vast amounts of data to optimize business results. He/she will apply knowledge of descriptive and multivariate statistical techniques and applications, along with database analysis tools and techniques, to develop strategic insights that drive business goals. The role works to advance the state of the art in how data is applied to solve problems in the public transportation industry. The Data Scientist will also be responsible for teaching data analysts and software engineers how to work effectively with vast quantities of data.
Strategy & Planning
- Work with cross-departmental teams to define metrics, guidelines, and strategies for effective use of algorithms and data.
- Identify, design, and build appropriate datasets for identifying complex data patterns and for analytics.
- Create data mining and analytics architectures, coding standards, statistical reporting, and data analysis methodologies.
- Establish links across existing data sources and find new, interesting mash-ups.
- Coordinate data resource requirements among analytics teams, application teams, and business stakeholders.
- Work with product managers, engineers, and analytics team members to translate prototypes into production.
- Assist in the development of data management policies and procedures.
- Develop best practices for analytics instrumentation and experimentation.
Acquisition & Deployment
- Conduct research and make recommendations on big data infrastructure, database technologies, analytics tools, services, protocols, and standards in support of procurement and development efforts.
- Be well versed in cloud technologies and big data platforms.
- Drive the collection of new data and the refinement of existing data sources.
- Develop algorithms and predictive models to solve critical business problems.
- Analyze historical data, identify patterns, and develop predictive, prescriptive, descriptive, and cognitive analytics.
- Develop tools and libraries that will help analytics team members more efficiently interface with huge amounts of data.
- Dive deep, as requirements demand, into supervised machine learning (classification with a binary target variable). Expert in handling a suite of non-parametric and parametric classification algorithms (Decision Trees, rule-based classifiers, Naive Bayes, Logistic Regression, Support Vector Machines, Nearest Neighbor classifiers), with in-depth use of R and Python. Expert in unsupervised learning techniques. Knows the optimal method or technique to use in different situations.
- Analyze large, noisy datasets and identify meaningful patterns that provide actionable results.
- Develop and automate new enhanced imputation algorithms.
- Create informative visualizations that intuitively display large amounts of data and/or complex relationships.
- Provide and apply quality assurance best practices for data science services across the organization.
- Develop, implement, and maintain change control and testing processes for modifications to algorithms and data analytics.
- Collaborate with database and disaster recovery administrators to ensure effective protection and integrity of data assets.
- Manage and/or provide guidance to junior members of the analytics team.
Formal Education & Certification
- Minimum 5 years' experience in modeling and analysis.
- Graduate or postgraduate university degree in computer science, mathematics, statistics, or data science, and/or equivalent work experience.
- CAP certification is highly desirable.
Knowledge & Experience
- Well versed in the CRISP-DM methodology and reference model.
- Very hands-on, with a proven track record of successful data science implementations in industry, preferably public transportation or travel.
- Responsible for building advanced analytics and models for critical business areas, including schedule optimization, route anomaly detection, driver utilization, GPS data, and engine fault codes.
- Perform advanced quantitative and statistical analysis of large datasets to identify trends, patterns, and correlations that can be used to improve business performance.
- Develop scalable, efficient, and automated processes for large scale data analyses and model development, validation, and implementation.
- Use machine learning and other techniques to prevent, detect, and resolve route-data and operational-data anomalies that require human intervention.
- Hold core, strategic responsibility for assessing, planning, and developing driver-scoring algorithms and models.
- Collaborate with internal and external business and data teams, data engineers, and analysts.
- Oversee processes within the Data and Analytics team to ensure sustainable and efficient deployment of trained models.
- Advanced knowledge of one or more of the following: Linear and Non-Linear Models, Time Series Analysis, Random Forest, SVM, Neural Networks, Unsupervised Methods (Dimensionality Reduction, Clustering, etc.).
- Probabilistic Modeling and Computation.
- Machine Learning, including Deep Learning.
- Advanced knowledge in R and Python.
- Advanced SQL skills.
- Must be able to perform impact data analysis and data profiling on source data systems and determine the best way to model data.
- Employ data visualization tools where appropriate (ggplot, Tableau, ThinkCell, etc.).
- Clearly communicate project requirements and modeling outcomes to technical and non-technical audiences.
- Ability to work in a self-directed work environment.
- Ability to work in a team environment.
- Extensive experience solving analytical problems using quantitative approaches.
- Comfort manipulating and analyzing complex, high-volume, high-dimensionality data from varying sources.
- Well versed in Git.
- Expert in NLP and ensemble methods.
- Perform exploratory data analysis using SQL.
- Must have machine learning and AI expertise.
- Familiarity with relational (SQL) and NoSQL databases.
- Expert knowledge of statistical analysis tools such as R, RStudio, MATLAB, and RapidMiner.
- Good working knowledge of reporting and analytics tools such as Tableau and Power BI.
- Experience with very large datasets is a must.
- Knowledge of the MapReduce framework (Hive, Pig, or other tools for accessing data in Hadoop/HBase cluster systems) is a plus.
- Experience in programming with Java, Perl, C.
- Project management experience.
- Good understanding of the organization's goals and objectives.
- Knowledge of applicable data privacy practices and laws.
- A strong passion for empirical research and for answering hard questions with data.
- A flexible analytic approach that allows for results at varying levels of precision.
- Ability to communicate complex quantitative analysis in a clear, precise, and actionable manner.
- Good written and oral communication skills.
- Strong technical documentation skills.
- Good interpersonal skills.
- Highly self-motivated and directed.
- Keen attention to detail.
- Ability to effectively prioritize and execute tasks in a high-pressure environment.
- Strong customer service orientation.
- Experience working in a team-oriented, collaborative environment.
Jayabalaji | Delivery Manager Staffing Services
Mobile: 908-440-1557 | Email: [email protected]