Interest in data science has grown more than fivefold over the last five years, as you can see above.
However, it is still not clear to many how data science consulting differs from regular consulting. After all, consulting is supposed to be about making data-driven decisions. A critical difference is that data science consultants leave their clients with reusable operational models, whereas most regular consulting projects answer important but one-off questions and do not leave clients with operational decision-making models.
What is data science consulting?
Data science consulting is the activity of effecting change by building up the client’s analytics skills, developing its competencies, and deepening its understanding of the workings of its business.
This process can be categorized under four main headings: strategy, validation, development, and training.
The strategy part of the consulting explores what’s possible with data and aims to create a plan.
This part requires extensive knowledge of the use cases. Depending on the client’s industry, the data collection method, regulation, and objectives can be completely different. In one case, the objective may be to optimize the energy consumption of a plant, which can be achieved by collecting data from the machinery and getting the necessary paperwork from the business owner. For an FMCG firm building a data pipeline to maximize sales, by contrast, data collection can be limited by red tape: consumer protection and personal data protection rules mean the legal side of the work has to be considered.
Collaboration between different departments is key to success. Both the business side and the IT side of the client need to be present when the problem is defined and possible solutions are worked out. The nature of data science makes the process more interdisciplinary and interdepartmental.
Strategy usually answers the following questions:
- What to do?
- What to collect?
- How to collect it?
- Where to store it?
- How to protect it?
- How to implement the solution?
The identified strategy needs to be validated before it is implemented. While a strategy can be drafted in hours in urgent cases, implementation can take months, so it pays to validate the strategy before committing to it.
Validation is a natural step in finalizing the strategy. However, a conflict of interest can arise when the people who built the strategy also judge its validity. In most consulting projects, in the interest of time, the same team both builds and validates the strategy, because a separate validation team would have to start the analysis almost from scratch, creating significant inefficiencies. Even so, treating validation as a distinct step makes it easier to spot problems in the strategy and to show how validation improved it.
Validation includes answering these questions:
- What is the insight behind this strategy?
- What is a low-cost way to test this strategy without fully implementing its findings?
- What do tests tell about the validity of the strategy?
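One low-cost way to test a strategy without implementing it is to replay it against historical data. The following is only a minimal sketch under stated assumptions: the hourly energy readings, the price, the peak hours, and the 15% saving are all hypothetical values chosen for illustration, loosely following the plant energy example mentioned earlier.

```python
# Hypothetical historical records: (hour_of_day, energy_used_kwh).
# In a real project these would come from the client's own logs.
history = [
    (2, 40.0), (3, 42.0), (10, 95.0), (11, 100.0),
    (14, 98.0), (15, 102.0), (22, 50.0), (23, 45.0),
]

PRICE_PER_KWH = 0.10         # assumed flat price
PEAK_HOURS = range(10, 16)   # assumed peak window: 10:00 to 15:59
ASSUMED_SAVING = 0.15        # assumed reduction in peak-hour usage


def baseline_cost(records):
    """Cost of the status quo: every reading billed as recorded."""
    return sum(kwh * PRICE_PER_KWH for _, kwh in records)


def strategy_cost(records):
    """Cost if the hypothetical strategy trimmed peak-hour usage."""
    total = 0.0
    for hour, kwh in records:
        if hour in PEAK_HOURS:
            kwh *= 1 - ASSUMED_SAVING
        total += kwh * PRICE_PER_KWH
    return total


def backtest(records):
    """Compare the strategy with the baseline on the same historical data."""
    base = baseline_cost(records)
    new = strategy_cost(records)
    return {
        "baseline": base,
        "strategy": new,
        "relative_saving": (base - new) / base,
    }


result = backtest(history)
print(f"Relative saving on historical data: {result['relative_saving']:.1%}")
```

A replay like this only shows what the strategy would have done under its own assumptions, so the assumed saving still has to be checked with a small real-world pilot.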
Development is the activity of designing and building a modern data product or internal tool. This is more like the IT part of data science consulting. Custom-tailored solutions for specific problems require a heavy emphasis on the development process.
This part has 3 main subjects as Steve Ballmer previously stated:
Developers! Developers! Developers!
Training boosts the data literacy of the client’s team. It makes sure the rest of the team is aware of the process and takes part in improving the system. It also ensures that the team can grasp the main points and contribute meaningfully to the continuous improvement of the entire process. Feedback mechanisms function well when staff can report in real time on how effective the data mechanism is.
How do data science consultants work?
Top management consultants like McKinsey have been putting significant effort into modernizing their project management approaches for data science. Their frameworks are similar to the ones we outlined above but it would be good to look at the areas they emphasize.
Below, you can see how McKinsey approaches advanced analytics/data science consulting:
Source of Value
Everything starts with the problem definition. In most data science projects, the problem is to find a new opportunity that will drive revenue growth and performance improvement. Consultants can help in this step by identifying key value creation opportunities powered by analytics/data science. The most common use cases are improving customer-facing activities, optimizing internal processes with data-driven insights, and expanding clients’ portfolio of offerings.
Consultants look for data sources to use in the project to achieve their aim of unlocking sources of value. Along with leveraging existing data sources, data science consultants can use data from third-party sources as well as data that is not widely used in analytics, such as raw data from IoT devices and sensors.
Data science consultants either build new data models or select existing models suited to the client’s problem. These models are tested on the client’s data to uncover insights.
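Testing a model on client data usually means checking that it beats a naive baseline on data it has not seen. Here is a minimal sketch, assuming a hypothetical dataset of (ad spend, sales) pairs and a simple least-squares line as the candidate model. Neither comes from the frameworks described here.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def mean_absolute_error(actual, predicted):
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def evaluate(train, test):
    """Return (model_mae, baseline_mae) measured on the held-out test set."""
    train_x, train_y = zip(*train)
    test_x, test_y = zip(*test)

    slope, intercept = fit_line(train_x, train_y)
    model_predictions = [slope * x + intercept for x in test_x]

    # Naive baseline: always predict the average of the training targets.
    train_mean = sum(train_y) / len(train_y)
    baseline_predictions = [train_mean] * len(test_y)

    return (
        mean_absolute_error(test_y, model_predictions),
        mean_absolute_error(test_y, baseline_predictions),
    )


# Hypothetical client data: (ad_spend, sales), with the latest rows held out.
train_rows = [(1, 12), (2, 19), (3, 31), (4, 42), (5, 48)]
test_rows = [(6, 61), (7, 70)]

model_mae, baseline_mae = evaluate(train_rows, test_rows)
print(f"model MAE: {model_mae:.2f}, baseline MAE: {baseline_mae:.2f}")
```

If the candidate does not clearly beat the baseline on held-out data, any insight drawn from it deserves skepticism.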
Turning Insights into Actions
With the results of their models, consultants create a feasible action plan that will include both process and technology changes. These steps can also include rolling out models built during the project to empower operational decisions.
Adoption of Technology
Data science consultants should be aware that their clients may not have a data-driven culture or be ready to adopt new tools. Consultants spend time training the client’s employees and ensuring that the prescribed actions are implemented.
Optimization of Organization and Governance
Lastly, consultants help build data governance and IT infrastructure to ensure that organizations can achieve lasting performance improvement. Performance improvements that do not address the governance aspects of change tend to be short-lived.
Data Science Consulting Industry
The industry players can be categorized into four types: MBB, Historical Tech Companies, Start-ups, and Big-Data-Big-Companies.
MBB (McKinsey, BCG, and Bain) are traditional consulting firms. They have been providing professional services to their clients for a long time, and they are now upgrading their offerings with more data-supported services such as advanced analytics.
McKinsey set up specialized teams for data analytics and there are some other ventures established specifically for this purpose. QuantumBlack is one of them. It was established in order to reimagine how organizations could continuously improve and outlearn their rivals. They provide services for various industries.
BCG set up BCG Gamma as its advanced analytics unit. The BCG Gamma team comprises world-class data scientists and business consultants who specialize in the use of advanced analytics to get breakthrough business results. BCG Gamma combines advanced skills in computer science, artificial intelligence, statistics, and machine learning with deep industry expertise.
Bain provides its data science consulting through the Bain Advanced Analytics Group. Its work focuses on three disciplines (primary research, advanced analytics, and big data) and is rooted in the group’s technical expertise, client experience, and knowledge of the latest data collection and analysis platforms and tools. Bain says it brings the right mix of disciplines to each client, recognizing that every challenge is unique.
Tech Consulting Companies
This category’s most important players are IBM and Accenture.
Accenture Analytics provides big data and related technology services to businesses and organizations seeking to harness the power of big data analytics. Accenture invests heavily in R&D, academic alliances, and incubation of emerging technologies to advance the industry’s thinking around big data and analytics. According to the company, its 900+ data scientists serve more than 2,000 analytics clients, 70 of which are Fortune Global 100 companies, and it has helped more than 50 global clients use their data to generate data equity, business value, and competitive advantage.
IBM provides big data consulting services. Its Big Data Services offering provides strategy, engineering, portfolio, and organization services to support clients’ big data efforts. These services include implementing big data, analytics, and cognitive solutions and capabilities, as well as providing their ongoing maintenance, enhancement, and support.
Data science startups are emerging quickly because of interest in the industry and because the data science landscape is still rapidly changing. Rapid growth is creating opportunities for new companies to capture emerging markets.
bitgrit is a data consulting company that helps companies identify data science use cases, build high-performing solutions, and hire data scientists to build their in-house teams. bitgrit combines its consulting services with data science competitions, helping companies find solutions to complex AI problems using the wisdom of hundreds of data scientists.
Datascope Analytics, a data consulting company that was acquired by IDEO in 2017, is another such firm. It works closely with its clients, using creative processes inspired by the design community to help them identify valuable and innovative ways to use data. It also makes these ideas a reality, building out everything from quick proofs of concept to scalable production systems.
Big-Data-Big-Companies are large, big-data-native firms, and Cloudera is one of them. Cloudera, the commercial Hadoop company, develops and distributes Hadoop, the open-source software that powers the data processing engines of the world’s largest and most popular websites. Founded by Facebook, Google, Oracle and Yahoo alumni, Cloudera’s mission is to bring the power of Hadoop, MapReduce, and distributed storage to companies of all sizes in the enterprise.
Palantir Technologies Inc. develops and builds data fusion platforms for public institutions, commercial enterprises, and non-profit organizations worldwide. The company offers Palantir Gotham, a platform that integrates, manages, secures, and analyzes enterprise data; and Palantir Metropolis, a platform that integrates, enriches, models, and analyzes quantitative data.
4 Factors to Consider When Choosing a Data Science Consultant
Do the team members have advanced degrees?
This is one of the major factors in deciding who to work with. Data science is increasingly crowded with people who claim to be data scientists, and a Ph.D. is usually one of the best available proxies for real expertise. Jonny Brooks-Bartlett’s guide describes an academic’s journey to becoming a data scientist and shows how much dedication is needed.
Do they have enough experience?
References matter. It is also important that the consultants have worked on a project in a similar setting, which shows that they can offer meaningful insight and know the practices of the specific industry. Organizations need to examine consultants’ previous projects to see that they have expertise in the following approaches:
Can they provide a long-term plan?
You need to make sure that the plan provided by the consultant is viable and can be updated regularly. Data science is a constantly improving field, so it is important to see what the consultant can offer over time. Think of it as a long-term investment: you may need further consulting and updates, so make sure they can plan over a longer horizon.
Do they have analytics translators on the team?
The technical capabilities of a data scientist matter to consultants only insofar as insights can be turned into actionable decisions. Analytics translators work with the data science team and combine its findings with business domain expertise to create actionable decisions.
Translators should be able to interpret and translate analytics insights into business benefits and guide the analytics work. To achieve this goal, these consultants should have domain knowledge, technical fluency, project management skills and an entrepreneurial spirit.
Pitfalls for data science projects
Kaggle, the data science competition community, ran a survey asking data scientists about the barriers they face at work. Most of their answers shed light on the things that can go wrong in data science projects:
Out of these problems, 3 categories are relevant for data science projects:
- Data related issues
  - Dirty data
  - Data unavailable or difficult to access
  - Privacy related issues
- Organization/project related issues
  - Lack of management support
  - Lack of clear questions to answer
  - Result not used by decision makers
  - Lack of domain experts
  - Need to coordinate with IT
  - Integrating findings into decisions
- Tool limitations
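Dirty data is the kind of barrier that can be measured before a project starts. As a rough illustration, the sketch below counts missing values and exact duplicate rows in a handful of records. The field names and sample rows are hypothetical, and a real audit would cover far more (types, ranges, consistency across sources).

```python
def audit(rows, required_fields):
    """Count missing values per field and exact duplicate rows."""
    missing = {field: 0 for field in required_fields}
    seen = set()
    duplicates = 0

    for row in rows:
        for field in required_fields:
            value = row.get(field)
            # Treat None and blank strings as missing.
            if value is None or (isinstance(value, str) and not value.strip()):
                missing[field] += 1

        # A row is a duplicate if all required fields match an earlier row.
        key = tuple(row.get(field) for field in required_fields)
        if key in seen:
            duplicates += 1
        else:
            seen.add(key)

    return {"rows": len(rows), "missing": missing, "duplicates": duplicates}


# Hypothetical customer records, for illustration only.
sample_rows = [
    {"customer_id": 1, "region": "North", "revenue": 120.0},
    {"customer_id": 2, "region": "", "revenue": 80.0},
    {"customer_id": 3, "region": "South", "revenue": None},
    {"customer_id": 1, "region": "North", "revenue": 120.0},
]

report = audit(sample_rows, ["customer_id", "region", "revenue"])
print(report)
```

Running a check like this on a sample of the client’s data early on gives both sides a concrete picture of how much cleaning the project will need.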
In summary, your data science project is only as good as your data and your organization. With high-quality data and a committed organization, you have already removed the most important barriers to data scientists’ efficiency.