Interest in data science grew more than fivefold during the last 5 years, as you can see above. According to PwC, there will be more than 2.9 million job postings for data science and analytics roles in the US alone by 2020. Yet data science talent is scarce, which is why businesses that lack it may need to rely on data science consulting companies.
Though both data science consulting and regular consulting are about making data-driven decisions, the critical difference is that data science consultants leave their clients with reusable operational models. Most regular consulting projects, by contrast, answer important but one-off questions and do not leave clients with operational decision-making models.
What is data science consulting?
Data science consulting is the activity of effecting change by building up the client’s analytics skills, developing their competencies, and deepening their understanding of the workings of their business.
Data science consulting firms provide 4 services to companies. These services are:
- Strategy building
- Validation of strategy
- Model development
- Employee training
The strategy part of the consulting explores what’s possible with data and aims to create a plan.
This part requires extensive knowledge regarding the use cases. Depending on the client’s industry, the data collection method, regulation, and objectives can be completely different.
In one case, the objective can be optimizing a plant’s energy consumption, which can be achieved by collecting data from machinery and getting the necessary paperwork from the business owner. For an FMCG firm trying to build a data pipeline to maximize sales, by contrast, data collection can be limited by red tape: consumer protection and personal data protection rules mean that the legal side of the work has to be considered.
Collaboration between different departments is the key to success. Both the client’s business side and its IT side need to be involved in defining the problem and in shaping possible solutions. The nature of data science makes the process more interdisciplinary and interdepartmental.
The strategy usually answers the following questions:
- What to do?
- What to collect?
- How to collect it?
- Where to store it?
- How to protect it?
- How to implement the solution?
The validation step tests the identified strategy before it is implemented. While a strategy can be drafted in hours in urgent cases, implementation can take months, so it is important to confirm that the strategy is sound before committing to it.
Validation is a natural step in finalizing the strategy. However, a conflict of interest may arise if the strategy is evaluated by the same people who proposed it.
In most consulting projects, in the interest of time, the same team builds and validates the strategy, since a separate validation team would have to start the analysis almost from scratch, creating significant inefficiencies. Even so, treating strategy and validation as separate steps makes it easier to spot problems in the strategy and to clarify how validation improved it.
Validation includes answering these questions:
- What is the insight behind this strategy?
- What is a low-cost way to test this strategy without fully implementing its findings?
- What do tests tell about the validity of the strategy?
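One low-cost way to approach the last two questions is a small pilot: apply the strategy to a few units (stores, machines, customer segments) and compare them with units left unchanged. The sketch below is only an illustration of that idea, not a prescribed method. It uses a permutation test, and the function name `permutation_test` and the sales figures are made up for the example.

```python
import random
from statistics import mean

def permutation_test(pilot, control, n_permutations=5000, seed=0):
    """Estimate how often a difference in means at least this large arises by chance."""
    rng = random.Random(seed)
    observed = mean(pilot) - mean(control)
    pooled = list(pilot) + list(control)
    n_pilot = len(pilot)
    extreme = 0
    for _ in range(n_permutations):
        # Randomly reassign the pooled values to "pilot" and "control"
        rng.shuffle(pooled)
        diff = mean(pooled[:n_pilot]) - mean(pooled[n_pilot:])
        if abs(diff) >= abs(observed):
            extreme += 1
    return observed, extreme / n_permutations

# Hypothetical weekly sales: units where the strategy was piloted vs. units left as-is
pilot_sales = [112, 125, 118, 130, 121, 127]
control_sales = [101, 99, 108, 104, 97, 103]

uplift, p_value = permutation_test(pilot_sales, control_sales)
print(f"Observed uplift: {uplift:.1f}, estimated p-value: {p_value:.4f}")
```

A small p-value suggests the uplift is unlikely to be a chance effect of which units were picked, though a real pilot would also need to account for seasonality and how the pilot units were selected.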
Development is the activity of designing and building a modern data product or internal tool. This is more like the IT part of data science consulting. Custom-tailored solutions for specific problems require a heavy emphasis on the development process.
This part has 3 main subjects, as Steve Ballmer previously stated:
Developers! Developers! Developers!
Training boosts the data literacy of the client’s team. It makes sure the rest of the team is aware of the process and involved in improving the system, so that team members grasp the main points and contribute meaningfully to the continuous improvement of the entire process. Feedback mechanisms function well when staff can report in real time on how effective the data mechanism is.
How do data science consultants work?
Top management consultants like McKinsey have been putting significant effort into modernizing their data science project management approaches. Their frameworks are similar to the ones we outlined above, but it would be good to look at the areas they emphasize.
Below, you can see how McKinsey approaches advanced analytics/data science consulting:
Source of Value
Everything starts with the problem definition. The problem in most data science projects is finding a new opportunity that will drive revenue growth and performance improvement. Consultants can also help in this step by identifying key value creation opportunities powered by analytics/data science. The most common use cases are improving customer-facing activities, optimizing internal processes with data-driven insights, and expanding clients’ portfolio of offerings.
Consultants look for data sources that the project can use to unlock value.
Data sources that data science consultants can use are:
- existing data sources such as organizations’ CRM systems
- 3rd party sources from data marketplaces or other data providers
- raw data sources from IoT devices and sensors
Data science consultants either build new data models or select from existing models specific to the client’s problem. These models are tested on the client’s data to uncover insights.
Turning Insights into Actions
With their models’ results, consultants create a feasible action plan that will include both process and technology changes. These steps can also include rolling out models built during the project to empower operational decisions.
Adoption of Technology
Data science consultants should know that their clients may not have a data-driven culture or be ready to adopt new data science tools. Consultants spend time training the client’s employees and ensuring that the prescribed actions are implemented.
Optimization of Organization and Governance
Lastly, consultants help build data governance and IT infrastructure to ensure that organizations can have lasting performance improvement. Performance improvements that do not address governance aspects of change tend to be short-lived.
Necessary Skills for Data Science Consultants
According to AltexSoft, the required and preferred skills of a data scientist can be categorized as follows:
- Coding languages
- Data management skills
- Knowledge of pre-existing ML algorithms and models
- Knowledge of frameworks and libraries
  - TensorFlow for neural networks
  - Scikit-learn for machine learning
- Experience in the industry
- Enthusiasm for problem-solving
Cases where hiring a data science consulting agency is a better option
Data science projects can be handled via the following approaches:
- hiring consulting companies
- developing solutions with an in-house team
- crowdsourcing model development through data science competitions
- hiring freelance developers
Companies can choose any of these options, yet each approach has pros and cons depending on the business’ industry, objectives, and budget.
Data science consulting is more advantageous than other approaches when:
- There is no suitable off-the-shelf solution for your use case: If companies have specific needs and existing off-the-shelf solutions do not meet those expectations, consulting companies can help build customized products so that businesses eliminate or minimize off-the-shelf solution risks such as costly customization projects.
- Budget is not enough to build an in-house team: A data science team includes roles such as Chief Data Officer, data analyst, business analyst, data scientist, data architect, data engineer, etc. Building such a team is an expensive approach considering that the average salary of a single data scientist working in-house is $94,000.
- Data science projects don’t require unique proprietary data: If your case and data are not unique, then consultants have probably worked with similar data before. Their experience can help accelerate your projects.
- Data set does not contain sensitive information: Companies must be careful before sharing data with third parties due to data privacy regulations. Methods such as synthetic data generation and data masking can help companies make their data ready for sharing.
- Your company needs guidance on identifying the business aspects of data science projects: This is why consulting firms are still popular. Most companies specialize in their own market, and their knowledge of data science strategy and project implementation is limited. Consultants help identify business processes where data science projects can be implemented.
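As an illustration of the data masking mentioned in the list above, here is a minimal sketch of how direct identifiers could be masked before a data set is shared with a consultant. It assumes records are simple dictionaries. The field names (`customer_id`, `email`, `basket_value`), the helper names, and the placeholder key are all hypothetical, and real projects would need a masking approach reviewed against the applicable privacy regulations.

```python
import hashlib
import hmac

# Hypothetical secret key; in practice it would be kept outside the shared data set
SECRET_KEY = b"replace-with-a-real-secret"

def pseudonymize(value, key=SECRET_KEY):
    """Replace an identifier with a keyed hash so records stay linkable but unreadable."""
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return digest[:16]

def mask_email(email):
    """Keep the domain (often useful for analysis) and hide the local part."""
    local, _, domain = email.partition("@")
    return f"{local[:1]}***@{domain}" if domain else "***"

def mask_records(records, id_field="customer_id", email_field="email"):
    """Return copies of the records with direct identifiers masked."""
    masked = []
    for record in records:
        clean = dict(record)  # copy so the original records are left untouched
        clean[id_field] = pseudonymize(str(record[id_field]))
        clean[email_field] = mask_email(record[email_field])
        masked.append(clean)
    return masked

# Made-up example records
customers = [
    {"customer_id": 1001, "email": "jane.doe@example.com", "basket_value": 42.5},
    {"customer_id": 1002, "email": "john@example.org", "basket_value": 17.0},
]
shareable = mask_records(customers)
print(shareable)
```

Using a keyed hash rather than a plain hash means the same customer still maps to the same token across tables, while someone without the key cannot simply hash known IDs to reverse the mapping.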
For more information on model development approaches, please check our guide on the ideal way to build AI projects.
Data Science Consulting Industry
The industry players can be categorized into four types: traditional consulting firms, tech consulting companies, data science startups, and big data native firms.
These are traditional consulting firms. With their professional services, they have been serving their clients for a while. Now, they are updating and upgrading their activities with more data-supported services like advanced analytics.
McKinsey set up specialized teams for data analytics, and there are some other ventures established specifically for this purpose. QuantumBlack is one of them. It was established to reimagine how organizations could continuously improve and outlearn their rivals. They provide services for various industries. They also provided some open-source libraries that they use in client projects for data scientists.
BCG set up BCG Gamma for their advanced analytics unit. BCG Gamma team comprises world-class data scientists and business consultants who specialize in advanced analytics to get breakthrough business results. BCG Gamma combines advanced computer science skills, artificial intelligence, statistics, and machine learning with deep industry expertise.
Bain provides its data science specific consulting activities through Bain Advanced Analytics Group. Their work focuses on three disciplines (primary research, advanced analytics, and Big Data) and is rooted in their technical expertise, client experience, and knowledge of the latest data collection, analysis platforms, and tools. They bring the right mix of disciplines to each client, recognizing that every challenge is unique.
Tech Consulting Companies
This category’s most important players are IBM and Accenture.
Accenture Analytics provides Big Data and related Technology services to businesses and organizations seeking to harness the power of big data analytics. Accenture invests heavily in R&D, academic alliances, and incubation of emerging technologies to advance the industry’s thinking around big data and analytics. Their 900+ data scientists currently serve more than 2,000 analytics clients, 70 of which are Fortune Global 100 companies. To date, they have helped more than 50 global clients use their data to generate data equity, business value, and competitive advantage.
IBM provides Big Data Consulting services. Big Data Services provides strategy, engineering, portfolio, and organization services to support your big data efforts. These services include implementing and providing ongoing maintenance, enhancement, and support of big data, analytics, and cognitive solutions and capabilities.
Data science startups are emerging because of interest in the industry, and they are changing the data science landscape. Rapid growth is creating opportunities for new companies to capture emerging markets.
bitgrit is a data consulting company that helps companies identify data science use cases, build high performing solutions, and hire data scientists to build their in-house teams. bitgrit combines its consulting services with data science competitions helping companies find solutions to complex AI problems using the wisdom of hundreds of data scientists.
Datascope Analytics, a data consulting company that was acquired by IDEO in 2017, is such a firm. They work closely with their clients, using creative processes inspired by the design community to help clients identify valuable and innovative ways to use data. They also make these ideas a reality, building everything from quick proofs of concept to scalable production systems.
These are large, big data native firms. Cloudera is one of them. Cloudera, the commercial Hadoop company, develops and distributes Hadoop, the open-source software that powers the data processing engines of the world’s largest and most popular websites. Founded by Facebook, Google, Oracle, and Yahoo alumni, Cloudera’s mission is to bring the power of Hadoop, MapReduce, and distributed storage to companies of all sizes in the enterprise.
Palantir Technologies Inc. develops and builds data fusion platforms for public institutions, commercial enterprises, and non-profit organizations worldwide. The company offers Palantir Gotham, a platform that integrates, manages, secures, and analyzes enterprise data; and Palantir Metropolis, a platform that integrates, enriches, models, and analyzes quantitative data.
4 Factors to Consider When Choosing a Data Science Consultant
4 criteria can help choose the right data consulting partner:
- Academic degree of consultants
- Experience in similar projects
- Duration of service they offer
- Analytics knowledge
Here are the questions you should be asking:
Do the team members have advanced degrees?
This is one of the major factors in deciding who to work with. Data science is increasingly an industry crowded with people claiming to be data scientists, and a Ph.D. is usually one of the best proxies for real expertise. Jonny Brooks-Bartlett’s guide describes the journey of an academic becoming a data scientist and shows how much dedication is needed.
Do they have enough experience?
References matter. It is also important to see that the consultants have worked on a project in a similar setting. This shows that the consultant can offer meaningful insight and knows the practices of the specific industry. Organizations need to examine consultants’ previous projects to confirm that they have the relevant expertise.
Can they provide a long-term plan?
You need to make sure that the consultant’s plan is viable and can be upgraded regularly. Data science is a field experiencing constant improvement, so it is important to see the plan’s potential. Think of it as a long-term investment: you may need consulting and updates again, so make sure the consultant can plan over a longer horizon.
Do they have analytics translators on the team?
A data scientist’s technical capabilities matter to a consulting engagement only insofar as the team can turn insights into actionable decisions. Analytics translators work with the data science team and combine its findings with business domain expertise to create actionable decisions.
Translators should be able to interpret and translate analytics insights into business benefits and guide the analytics work. These consultants should have domain knowledge, technical fluency, project management skills, and an entrepreneurial spirit to achieve this goal.
Salaries of data science consultants
Salaries of data science consultants vary based on experience and location. According to Neuvoo, these are the average data science consultant salaries by country. The top and bottom end of the ranges can help you understand how experience impacts salary:
| Country | Median Salary (per year) | Lowest Salary (per year) | Highest Salary (per year) |
| --- | --- | --- | --- |
| India | ₹ 1,287,500 | ₹ 216,000 | ₹ 1,750,000 |
Pitfalls for data science projects
Kaggle, the data science competition community, ran a survey asking data scientists about the barriers they faced at work. Most of their answers shed light on the things that can go wrong in data science projects.
Out of these problems, 3 categories are relevant for data science projects:
- Data related issues
  - Dirty data
  - Data unavailable or difficult to access
  - Privacy-related issues
- Organization/project related issues
  - Lack of management support
  - Lack of clear questions to answer
  - Result not used by decision-makers
  - Lack of domain experts
  - Need to coordinate with IT
  - Integrating findings into decisions
- Tool limitations
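Dirty data is often the easiest of these barriers to detect early. As a rough sketch of what a first check might look like, the snippet below counts missing values and exact duplicate rows in a list of records. The function name `profile_quality`, the field names, and the sample rows are assumptions made for illustration, and a real project would usually add checks for ranges, formats, and outliers.

```python
def profile_quality(rows, required_fields):
    """Count missing values per required field and exact duplicate rows."""
    missing = {field: 0 for field in required_fields}
    seen = set()
    duplicates = 0
    for row in rows:
        for field in required_fields:
            value = row.get(field)
            # Treat None and blank strings as missing
            if value is None or (isinstance(value, str) and not value.strip()):
                missing[field] += 1
        # A hashable fingerprint of the whole row, used to detect exact duplicates
        fingerprint = tuple(sorted((key, repr(val)) for key, val in row.items()))
        if fingerprint in seen:
            duplicates += 1
        else:
            seen.add(fingerprint)
    return {"rows": len(rows), "missing": missing, "duplicates": duplicates}

# Made-up order records with a blank region, a duplicated row, and a missing amount
orders = [
    {"order_id": 1, "region": "EMEA", "amount": 120.0},
    {"order_id": 2, "region": "", "amount": 80.0},
    {"order_id": 2, "region": "", "amount": 80.0},
    {"order_id": 3, "region": "APAC", "amount": None},
]
report = profile_quality(orders, ["order_id", "region", "amount"])
print(report)
```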
In summary, your data science project is only as good as your data and your organization. With high-quality data and a committed organization, you would already remove the most important barriers to data scientists’ efficiency.