AIMultiple Research

Guide to Automated Redaction & Its Benefits in 2024 

Data privacy is protected by many laws that set out how private data must be handled, such as the Health Insurance Portability and Accountability Act (HIPAA) and the Gramm-Leach-Bliley Act.1 2

Complying with data protection laws and regulations can be challenging. EU data protection authorities have issued about $1.2 billion in fines for violations of the bloc’s General Data Protection Regulation (GDPR) since 2021, nearly 7 times the sum of last year’s fines.3 In the U.S., the average cost of a data breach was $9.44 million in 2022, and data theft or compromise accounted for about 19% of all data breaches.4

Automated redaction can assist in compliance with data protection laws and regulations. We recommend you learn about automated redaction before using it in your business. This article explains:

  • What is automated redaction, and how does it work?
  • Why is it important?
  • What is role-based redaction, and how does it work?
  • What are the benefits of role-based redaction?
  • What is AI redaction, and how does it work?
  • And what are the benefits of AI redaction?

What is automated redaction?

“Redaction” is the process of removing or blacking out sensitive information in electronic documents (e-documents). Redaction can be applied to file types such as text files or images. By using redaction techniques, organizations can keep confidential information from being seen by people who should not be able to see it.

Examples of such sensitive data include a company’s internal valuation or information about a client during a lawsuit. Medical records, legal documents, and customer documents are among the documents that are commonly redacted.

Personally identifiable information and sensitive personal information

In commercial businesses, sensitive data can be divided into personally identifiable information (PII) and sensitive personal information (SPI).

PII is information that can be used to identify an individual directly. Names and social security numbers (SSNs) are examples of this information. SPI, on the other hand, allows for the indirect identification of individuals. Examples include genetic, sex, and financial data.

What are the different types of redaction?

Redaction can be divided into three subcategories:

  • Image-based redaction: In image-based redaction, an image is placed over the PII or SPI to obscure it (Figure 1).

Figure 1: Image-based redaction.5 

  • Text-based redaction: In text-based redaction, replacement text is put in place of the PII or SPI to obscure it (Figure 2). In the figure, confidential information is replaced with the text "DELETED."

Figure 2: Text-based redaction.6

  • Hybrid redaction: Image-based and text-based redaction are used together to redact confidential material.

What is automated redaction, and how does it work?

With automated redaction, special software finds PII and SPI in e-documents and redacts them. Automated redaction can be beneficial when managing batches of documents or images.

In automated redaction, users can import spreadsheets containing customer names, emails, and other PII or SPI into the software.

The automated redaction tool can then use pattern matching to find every occurrence of the imported information in the document being redacted. After the match, the software can replace the target information using image-based or hybrid redaction, according to the user’s choice (see Video 1).

Video 1: Automated redaction.
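
The import-and-match workflow described above can be sketched in a few lines of Python. This is a minimal illustration, not how any particular redaction product works: the function names `load_terms` and `redact`, the sample spreadsheet, and the "[REDACTED]" placeholder are all assumptions made for the example, and plain text replacement stands in for the image-based or hybrid redaction a real tool would apply.

```python
import csv
import io
import re


def load_terms(csv_text, columns):
    """Collect the non-empty values from the chosen columns of an imported sheet."""
    reader = csv.DictReader(io.StringIO(csv_text))
    terms = set()
    for row in reader:
        for col in columns:
            value = (row.get(col) or "").strip()
            if value:
                terms.add(value)
    return terms


def redact(text, terms, replacement="[REDACTED]"):
    """Replace every occurrence of any imported term, ignoring case.

    Returns the redacted text and the number of replacements made.
    """
    if not terms:
        return text, 0
    # Try longer terms first so a full name wins over a shorter overlapping term.
    ordered = sorted(terms, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(t) for t in ordered), re.IGNORECASE)
    return pattern.subn(replacement, text)


# Hypothetical spreadsheet content; a real tool would read an uploaded file.
SHEET = "name,email\nAda Lovelace,ada@example.com\nAlan Turing,alan@example.com\n"

terms = load_terms(SHEET, ["name", "email"])
redacted, count = redact(
    "Contact Ada Lovelace at ADA@example.com; cc Alan Turing.", terms
)
print(redacted)  # Contact [REDACTED] at [REDACTED]; cc [REDACTED].
print(count)     # 3
```

Because the match is driven entirely by the imported list, anything that is not in the spreadsheet is left untouched, which is why the quality of the imported data matters in this approach.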

Why is it important?

Missed redactions 

Automated redaction can reduce the risk of missed redactions by using pattern matching to find sensitive information and redact it automatically.

Redactions that are never attempted, forgotten, or overlooked are called “missed redactions.” Missed redactions can occur during manual redaction due to human error. An automated redaction tool reduces this risk because it scans through every document and applies the same pattern matching each time.

Inefficient redactions

Automated redaction tools can be designed to prevent inefficient redactions.

Redactions that are attempted but turn out to be ineffective are called “inefficient redactions.”

Inefficient redactions can be bypassed. When redaction is not done well, the text under the redacted line can be copied and pasted to another page, or the redacted line’s content can be seen when the line is clicked.

Inefficient redaction can occur in manual redaction when the user relies on methods that are not provided by redaction software.

Sensitive information may leak as a result of inefficient redactions. A document from the Mueller/Manafort investigation is an example of inefficient redaction.7 In the investigation document, the redacted lines can be bypassed simply by selecting the text under the black boxes with the cursor and copying it.8
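
The difference between an inefficient redaction and an effective one can be illustrated with a toy model. The sketch below assumes a simplified page made of a selectable text layer plus a list of black boxes drawn on top of it; the `Page` class and both functions are hypothetical and do not correspond to any real document format or library.

```python
from dataclasses import dataclass, field


@dataclass
class Page:
    text: str                                  # selectable text layer
    boxes: list = field(default_factory=list)  # (start, end) spans drawn as black boxes

    def copy_text(self):
        """What a reader gets from select-and-copy: the text layer only."""
        return self.text


def cover_only(page, start, end):
    """Inefficient redaction: draw a box but leave the text layer untouched."""
    return Page(page.text, page.boxes + [(start, end)])


def remove_and_cover(page, start, end):
    """Effective redaction: overwrite the characters, then draw the box."""
    masked = page.text[:start] + "X" * (end - start) + page.text[end:]
    return Page(masked, page.boxes + [(start, end)])


page = Page("SSN: 123-45-6789")
secret = "123-45-6789"
start = page.text.index(secret)
end = start + len(secret)

leaky = cover_only(page, start, end)
safe = remove_and_cover(page, start, end)

print(secret in leaky.copy_text())  # True: the box hides nothing from copy and paste
print(secret in safe.copy_text())   # False: the underlying characters are gone
```

Both pages look the same on screen, since each has a box over the same span. Only the second one still protects the information after a copy and paste.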

Both missed and inefficient redactions can leave sensitive information unprotected. This can result in the following:

  • Monetary lawsuits 
  • Penalties due to non-compliance with laws and regulations
  • Brand reputation damage

What are the benefits of automated redaction?

  • Increase compliance with data privacy laws and rules: Automated data redaction can help make sure privacy laws are followed because it can:
    • Reduce the risk of missed redactions: In manual redaction, users can overlook sensitive information and miss redactions. An automated redaction tool can scan documents, detect sensitive information, and apply redaction automatically. This can reduce the risk of missed redactions and human error.
    • Reduce the risk of inefficient redactions: Automated redaction tools can be programmed to eliminate inefficient redactions. Redaction software is designed to prevent redacted text from being copied or revealed when clicked, which can eliminate inefficient redactions caused by human error.
  • Save time: Automated redaction can save time when redacting batches of e-documents. Users can import spreadsheets containing PII or SPI into an automated redaction tool. Then, the software can use pattern matching to redact personal information automatically.
  • Save money: Automated redaction can reduce the amount of manual review that has to be paid for. An average review attorney, for example, can charge between $100 and $400.

What is role-based redaction?

Content services can offer role-based redaction. Role-based redaction can provide the same services as automated redaction, and it also enables authorized users to hide and show redacted information.

In role-based redaction, authorized users can hide and show confidential information by clicking on the redacted parts of the document or image. Other users cannot change the status of redacted information and do not have access to it (see Video 2).

How does it work?

The authorized user can hide or show sensitive data in the redacted document by clicking the redaction button on the control panel.

The authorized user can determine the size of the redaction, i.e., the blackout box, simply by dragging the cursor over sensitive material. Then, the user can set a reason code for redaction like “social security number” or “credit card number.” 

Then, the authorized user can save the document and click the “toggle” button to toggle redactions on and off.

The toggle button is accessible only by authorized users. Once the redaction is saved, other users cannot see the redacted information, even if the authorized user toggles off the redaction (see Video 2).

Video 2: Role-based redaction.
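
The toggle behavior described above can be sketched as a small rendering function. This is only an illustration under stated assumptions: the `Redaction` class, the `render` function, and the role name "records_officer" are hypothetical, redactions are assumed not to overlap, and a real content services platform would enforce the permission check on the server rather than in a helper like this.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Redaction:
    start: int
    end: int
    reason: str  # reason code, e.g. "social security number"


# Hypothetical role name; a real system would load roles from its access policy.
AUTHORIZED_ROLES = {"records_officer"}


def render(text, redactions, role, show_redacted=False):
    """Return the view of `text` for a user with the given role.

    Only authorized roles can toggle redactions off. For every other role
    the toggle request is ignored and the redactions stay applied.
    """
    if show_redacted and role in AUTHORIZED_ROLES:
        return text
    parts = []
    cursor = 0
    # Assumes the redacted spans do not overlap.
    for r in sorted(redactions, key=lambda r: r.start):
        parts.append(text[cursor:r.start])
        parts.append("[" + r.reason.upper() + "]")
        cursor = r.end
    parts.append(text[cursor:])
    return "".join(parts)


doc = "Jane Roe, SSN 123-45-6789"
ssn_start = doc.index("123-45-6789")
marks = [Redaction(ssn_start, ssn_start + 11, "social security number")]

print(render(doc, marks, "records_officer", show_redacted=True))
# Jane Roe, SSN 123-45-6789
print(render(doc, marks, "clerk", show_redacted=True))
# Jane Roe, SSN [SOCIAL SECURITY NUMBER]
```

Keeping one stored document and computing the view per user is what lets authorized and unauthorized users work from the same file instead of separate copies.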

What are the benefits of role-based redaction?

  • Increase flexibility: Role-based redaction can increase flexibility over redacted documents with user controls that enable authorized personnel to show and hide redacted information.
  • Reduce duplicate documents: Role-based redaction enables users to work on the same documents with different views. Authorized personnel can see confidential information, but unauthorized users cannot. This can enable employees and customers to work on original documents without creating extra copies. This can:
    • Save time in searching for documents
    • Enhance the organization of electronic documents by reducing duplicate documents
    • Reduce errors and confusion arising from storing duplicate documents

What is AI redaction?

The use of artificial intelligence (AI) to perform redaction tasks is known as “AI redaction.”

AI redaction can provide the services that automated and role-based redaction provide. In the case of AI redaction, artificial intelligence can identify PII and SPI that must be redacted. 

How does it work?

The AI redaction tool can identify names, emails, gender information, or other sensitive data and mark each item with a box when the document is displayed. The user can click on a box to redact the sensitive information inside it.

The AI redaction tool can also offer suggestions and checks. Users can review a list of detected items, such as names and emails, to find out which sensitive information has been redacted and which has not.

Video 3: AI redaction.
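
The suggest-and-confirm workflow described above can be sketched as follows. Note the main assumption: the `detect` function here is a simple regular expression that only finds email addresses, used as a stand-in so the example runs on its own. It is not an AI model, and a real AI redaction tool would call a trained model at that step. All names in the sketch are hypothetical.

```python
import re
from dataclasses import dataclass


@dataclass
class Suggestion:
    start: int
    end: int
    label: str
    accepted: bool = False  # set to True when the user clicks the box


EMAIL = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


def detect(text):
    """Stand-in detector; a real AI redaction tool would call a model here."""
    return [Suggestion(m.start(), m.end(), "email") for m in EMAIL.finditer(text)]


def apply_accepted(text, suggestions):
    """Redact accepted suggestions only, right to left so offsets stay valid."""
    for s in sorted(suggestions, key=lambda s: s.start, reverse=True):
        if s.accepted:
            text = text[:s.start] + "[" + s.label.upper() + "]" + text[s.end:]
    return text


def review_list(text, suggestions):
    """List each detected item with its current status."""
    return [
        (text[s.start:s.end], "redacted" if s.accepted else "not redacted")
        for s in suggestions
    ]


text = "Write to ada@example.com or alan@example.com."
suggestions = detect(text)
suggestions[0].accepted = True  # the user clicks the first box

print(apply_accepted(text, suggestions))
# Write to [EMAIL] or alan@example.com.
print(review_list(text, suggestions))
# [('ada@example.com', 'redacted'), ('alan@example.com', 'not redacted')]
```

The review list is what gives the user the single-page view of redacted and unredacted items, regardless of which detector produced the suggestions.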

What are the benefits of AI redaction?

  • Enables unstructured content redaction: AI redaction can recognize sensitive information in unstructured content like scanned documents.
  • Saves time: AI redaction can recognize sensitive information without importing spreadsheets required for pattern matching. This can help save time during the redaction process.
  • Increases visibility: AI redaction lets users see redacted and unredacted information on a single page.
  • Provides easy-to-use interface: AI redaction provides an easy-to-use platform to conduct redactions, where users can click on marked texts.
  • Reduces errors: By reducing manual intervention compared to automated and role-based redaction, AI redaction decreases the risk of errors during redaction. 
  • Increases compliance with data privacy laws and regulations: AI redaction can increase compliance with data privacy laws and regulations by reducing errors.

Cem Dilmegani
Principal Analyst

Cem has been the principal analyst at AIMultiple since 2017. AIMultiple informs hundreds of thousands of businesses (as per similarWeb) including 60% of Fortune 500 every month.

Cem's work has been cited by leading global publications including Business Insider, Forbes, Washington Post, global firms like Deloitte, HPE, NGOs like World Economic Forum and supranational organizations like European Commission. You can see more reputable companies and media that referenced AIMultiple.

Throughout his career, Cem served as a tech consultant, tech buyer and tech entrepreneur. He advised businesses on their enterprise software, automation, cloud, AI / ML and other technology related decisions at McKinsey & Company and Altman Solon for more than a decade. He also published a McKinsey report on digitalization.

He led technology strategy and procurement of a telco while reporting to the CEO. He has also led commercial growth of deep tech company Hypatos that reached a 7 digit annual recurring revenue and a 9 digit valuation from 0 within 2 years. Cem's work in Hypatos was covered by leading technology publications like TechCrunch and Business Insider.

Cem regularly speaks at international technology conferences. He graduated from Bogazici University as a computer engineer and holds an MBA from Columbia Business School.


Comments


1 Comments
Zach long
Apr 26, 2023 at 00:10

Hi Cem,

Who uses this software today?

Bardia Eshghi
May 05, 2023 at 07:44

Hi Zach. Doesn’t our article answer your question?
