AIMultiple Research

Citizen Data Scientists: 4 ways to democratize data science [2024]

Analytics vendors and non-technical employees are democratizing data science. Organizations are looking at converting non-technical employees into data scientists so that they can combine their domain expertise with data science technology to solve business problems.

What does citizen data scientist mean?

Citizen data scientist is a term coined by Gartner, which defines it as “a person who creates or generates models that use advanced diagnostic analytics or predictive and prescriptive capabilities, but whose primary job function is outside the field of statistics and analytics.”

In short, they are non-technical employees who can use data science tools to solve business problems.

Citizen data scientists can provide business and industry domain expertise that many data science experts lack. Their business experience and awareness of business priorities enable them to effectively integrate data science and machine learning output into business processes.

Why are citizen data scientists important now?

Interest in citizen data science almost tripled between 2012 and 2022.

Source: Google Trends

Reasons for this growing interest are:

[Figure: data scientist shortage graph. Source: QuantHub]
  • As with anything in short supply, data science talent is expensive. According to the U.S. Bureau of Labor Statistics, the average data science salary is $101k.
  • Analytics tools are now easier to use, which reduces reliance on data scientists.

Most industry analysts are also highlighting the increased role of citizen data scientists in organizations:

  • IDC big data analytics and AI research director Chwee Kan Chua mentions in an interview: “Lowering the barriers to allow even non-technical business users to be ‘data scientists’ is a great approach.”
  • Gartner defined the term and promotes it heavily.

What are the tools used by citizen data scientists?

Various solutions help businesses to democratize AI and analytics:

  • Citizen data scientists first need to understand business data and access it from various systems. Metadata management solutions like data catalogs or self-service data reporting tools can help citizen data scientists with this.
  • Automated Machine Learning (AutoML): AutoML solutions can automate manual and repetitive machine learning tasks to empower citizen data scientists. ML tasks that AutoML tools can automate include:
    • Data pre-processing
    • Feature engineering
    • Feature extraction
    • Feature selection
    • Algorithm selection & hyperparameter optimization
  • Augmented analytics / AI-driven analytics: ML-led analytics, where tools extract insights from data in two forms:
    • Search-driven: Software returns results in various formats (reports, dashboards, etc.) to answer citizen data scientists’ queries.
    • Auto-generated: ML algorithms identify patterns to automate insight generation.
  • No/low-code and RPA solutions: These minimize coding with drag-and-drop interfaces, which helps citizen developers put the models they prepare into production.
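To make the "algorithm selection & hyperparameter optimization" task above more concrete, here is a minimal sketch of what an AutoML tool automates. It is not how any specific vendor works: the toy dataset, the two candidate models, and the function names (make_data, auto_select, etc.) are all assumptions made for illustration. The idea is simply to try every candidate on held-out data and keep the one with the lowest error.

```python
import random
import statistics

def make_data(n=200, seed=0):
    """Hypothetical toy dataset: y is roughly 3*x plus a little noise."""
    rng = random.Random(seed)
    xs = [rng.uniform(0, 10) for _ in range(n)]
    return [(x, 3 * x + rng.gauss(0, 1)) for x in xs]

def mean_model(train):
    """Baseline candidate: always predict the mean of the training targets."""
    mean_y = statistics.fmean(y for _, y in train)
    return lambda x: mean_y

def knn_model(train, k):
    """Candidate: average the targets of the k nearest training points."""
    def predict(x):
        nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
        return statistics.fmean(y for _, y in nearest)
    return predict

def mse(model, rows):
    """Mean squared error of a model on (x, y) rows."""
    return statistics.fmean((model(x) - y) ** 2 for x, y in rows)

def auto_select(data, holdout=0.25):
    """Score every candidate on a validation split and return the best one."""
    split = int(len(data) * (1 - holdout))
    train, valid = data[:split], data[split:]
    # Algorithm selection: two model families.
    candidates = {"mean": lambda: mean_model(train)}
    # Hyperparameter optimization: a small grid over k (illustrative values).
    for k in (1, 5, 15):
        candidates[f"knn(k={k})"] = lambda k=k: knn_model(train, k)
    scores = {name: mse(build(), valid) for name, build in candidates.items()}
    best = min(scores, key=scores.get)
    return best, scores

if __name__ == "__main__":
    best, scores = auto_select(make_data())
    for name, score in sorted(scores.items(), key=lambda kv: kv[1]):
        print(f"{name}: validation MSE = {score:.2f}")
    print("selected:", best)
```

Real AutoML products search far larger spaces and also handle pre-processing and feature engineering, but the loop of "generate candidates, validate, pick the winner" is the part a citizen data scientist no longer has to write by hand.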

What are best practices for citizen data science projects?

Create a workspace where citizen data scientists and data science experts can work collaboratively

Most citizen data scientists are not trained in the foundations of data science. They rely on tools to generate reports, analyze data, and create dashboards or models. To maximize citizen data scientists’ value, build teams that can support them, including data engineers and expert data scientists.

Train citizen data scientists

Though citizen data scientists’ business knowledge is an advantage, their inexperience in data science makes projects prone to errors. Citizen data scientists could be trained in the following areas:

  • use of BI/AutoML tools for maximum efficiency
  • data security training to maintain data compliance
  • detecting AI biases and creating standards for model trust and transparency so that citizen data scientists can establish explainable AI (XAI) systems.

Classify datasets based on accessibility

Due to data compliance requirements, not all data should be accessible to all employees. Classifying the datasets that require limited access can help overcome this issue.
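One way to picture such a classification is a small lookup that pairs each dataset with a sensitivity level and each role with a clearance. The sketch below is only an illustration: the level names, dataset names, and roles are hypothetical, and a real deployment would rely on the access controls of the data platform in use.

```python
from enum import IntEnum

class Sensitivity(IntEnum):
    """Hypothetical sensitivity tiers, ordered from least to most restricted."""
    PUBLIC = 1
    INTERNAL = 2
    RESTRICTED = 3

# Hypothetical catalog: each dataset is tagged with one sensitivity level.
DATASET_LEVELS = {
    "marketing_campaigns": Sensitivity.PUBLIC,
    "sales_pipeline": Sensitivity.INTERNAL,
    "customer_pii": Sensitivity.RESTRICTED,
}

# Hypothetical roles: the highest level each role is cleared to read.
ROLE_CLEARANCE = {
    "citizen_data_scientist": Sensitivity.INTERNAL,
    "expert_data_scientist": Sensitivity.RESTRICTED,
}

def can_access(role, dataset):
    """Allow access only if the role's clearance covers the dataset's level.

    Unknown roles and unclassified datasets are denied by default.
    """
    clearance = ROLE_CLEARANCE.get(role)
    level = DATASET_LEVELS.get(dataset)
    if clearance is None or level is None:
        return False
    return clearance >= level

def accessible_datasets(role):
    """List the datasets a role may read, in alphabetical order."""
    return sorted(name for name in DATASET_LEVELS if can_access(role, name))

if __name__ == "__main__":
    for role in ROLE_CLEARANCE:
        print(role, "->", accessible_datasets(role))
```

Denying access by default means a dataset that nobody has classified yet stays out of reach until someone reviews it.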

Create a sandbox for testing

Sandboxes are software testing environments that contain synthetic data and are not connected to production environments. They help citizen data scientists quickly test their models before rolling them out to production.
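As a rough illustration of the synthetic data a sandbox might hold, the sketch below generates fake customer rows from a simple schema. The column names, value ranges, and the 20% churn rate are all made up for the example and do not come from any real dataset. A fixed seed keeps the data reproducible, so two people testing in the sandbox see the same rows.

```python
import random

# Hypothetical schema: each column maps to a generator that takes a random
# number generator and the row index. No real customer data is involved.
SCHEMA = {
    "customer_id": lambda rng, i: f"SYN-{i:05d}",
    "region": lambda rng, i: rng.choice(["north", "south", "east", "west"]),
    "monthly_spend": lambda rng, i: round(rng.uniform(10, 500), 2),
    "churned": lambda rng, i: rng.random() < 0.2,  # assumed 20% churn rate
}

def synthetic_rows(n, seed=42):
    """Generate n synthetic rows; the same seed always yields the same rows."""
    rng = random.Random(seed)
    return [
        {column: generate(rng, i) for column, generate in SCHEMA.items()}
        for i in range(n)
    ]

if __name__ == "__main__":
    for row in synthetic_rows(3):
        print(row)
```

Independent random columns like these are enough to check that a pipeline runs end to end, but they do not preserve the relationships found in real data, so model quality still has to be validated elsewhere.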

Cem Dilmegani
Principal Analyst

Cem has been the principal analyst at AIMultiple since 2017. AIMultiple informs hundreds of thousands of businesses (as per similarWeb) including 60% of Fortune 500 every month.

Cem's work has been cited by leading global publications including Business Insider, Forbes, Washington Post, global firms like Deloitte, HPE, NGOs like World Economic Forum and supranational organizations like European Commission. You can see more reputable companies and media that referenced AIMultiple.

Throughout his career, Cem served as a tech consultant, tech buyer and tech entrepreneur. He advised businesses on their enterprise software, automation, cloud, AI / ML and other technology related decisions at McKinsey & Company and Altman Solon for more than a decade. He also published a McKinsey report on digitalization.

He led technology strategy and procurement of a telco while reporting to the CEO. He also led the commercial growth of deep tech company Hypatos, which reached 7-digit annual recurring revenue and a 9-digit valuation from 0 within 2 years. Cem's work at Hypatos was covered by leading technology publications like TechCrunch and Business Insider.

Cem regularly speaks at international technology conferences. He graduated from Bogazici University as a computer engineer and holds an MBA from Columbia Business School.

Comments

Susan Obi
Jan 29, 2021 at 17:53

One of the most commonly overlooked factors in a Citizen Data Scientist initiative is providing business users with a foundation from which to jump start their involvement. A basic course that explains the roles, new ways of collaboration, the use of augmented analytics and analytical techniques is an imperative if you want to get the initiative going in a positive direction. Remember that the business must make the cultural changes required to shift roles and to encourage data literacy and digital transformation.

Cem Dilmegani
Jan 29, 2021 at 18:24

Thanks for the comment. Link removed as per policy. We are fans of learning on the job however good to hear that courses are also being built on the topic.
