
Decoding the information dilemma: Strategies for effective data deletion within the age of AI

Companies today have a tremendous opportunity to put data to use in new ways. However, they also need to examine what data they store and how they use it to avoid potential legal issues. Amid the expansion of generative AI, companies are responsible not only for protecting their data, especially personal data, but also for strategically managing and deleting older information that poses more risk than business value.

Forrester predicts that unstructured data will double in 2024, driven in part by AI. But the evolving data landscape and the rising costs of security and data breaches require a critical look at how to create an effective and robust data retention and deletion strategy.

Data explosion and escalating security breach costs

As data volumes grow, so do the costs of security and data breaches. Among other incidents, ransomware criminals have taken over highly sensitive medical and government databases, with hacks hitting the Australian courts, a Kentucky healthcare company, 23andMe, and major corporations such as Infosys, Boeing, and security provider Okta. These breaches are also becoming costlier: IBM found that the average total cost of a breach reached $4.45 million in 2023, a 15% increase compared to 2020.

To manage data effectively, companies must create a policy for deleting outdated data. With generative AI, executives may wonder whether anything should ever be deleted, given future opportunities. But the longer a company stores data, the greater the chance of data breaches or fines for violating data protection law. The first step to minimizing this risk is to take a comprehensive look at how a company uses its data, along with the nuanced considerations and specific advantages of a data retention strategy.

Why remove stale data?

Due to legal requirements central to data protection, companies are often obliged to delete outdated data. Legal regulations require personal data to be retained only as long as necessary, leading companies to adopt retention policies with varying time periods depending on their business area. Deleting outdated data not only reduces legal liability but can also reduce storage costs.

Identifying stale data

The best way to determine which data can be considered stale and which data creates lasting business value is to start with a data map that describes the sources and types of incoming data, which fields are included, and which systems or servers the data is stored on. A comprehensive data map ensures that a company knows where personal data is stored, what types of personal data are processed, what types of protected or special-category data are processed, what the intended purposes of processing are, and the geographical locations of the processing and the systems involved.

A meaningful data inventory and classification is the foundation of a solid data protection program and helps provide the data lineage necessary to understand the flow of data through an organization's systems.
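As a minimal sketch, a data map can be kept as a structured inventory that records, for each asset, the answers to the questions above. The field names, categories, and example entries here are illustrative assumptions, not a standard schema.

```python
from dataclasses import dataclass

@dataclass
class DataAsset:
    """One entry in a data map: where a dataset lives and what it contains."""
    name: str                 # dataset name, e.g. "crm_contacts" (hypothetical)
    source_system: str        # system or server holding the data
    location: str             # geographic region of processing/storage
    fields: list              # field names collected
    personal_data: bool       # contains personal data?
    special_category: bool    # protected/special-category data (e.g. health)?
    purpose: str              # intended processing purpose

# Illustrative inventory entries
data_map = [
    DataAsset("crm_contacts", "salesforce", "us-east",
              ["name", "email", "phone"],
              personal_data=True, special_category=False,
              purpose="customer support"),
    DataAsset("web_logs", "nginx-cluster", "eu-west",
              ["ip_address", "url", "timestamp"],
              personal_data=True, special_category=False,
              purpose="security monitoring"),
]

# An inventory like this lets stakeholders answer basic questions quickly,
# e.g. which assets contain personal data:
personal = [a.name for a in data_map if a.personal_data]
```

Even a lightweight structure like this makes it possible to filter assets by region, purpose, or data category when legal and engineering teams review retention decisions.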

Once a company has a map of its data assets, legal and engineering teams can work with business stakeholders to determine how useful certain data might be, what regulatory restrictions apply to storing it, and what the potential consequences would be if it were breached or kept longer than necessary.

Most business stakeholders will naturally shy away from deleting anything, especially when technology is changing so quickly. The conversation around deletion and retention must focus on what is most useful to the business. For example, imagine a data analytics team at a financial institution that wants to ensure lending models are trained on as much data as possible. Unfortunately, this approach contradicts the intent of privacy laws.

The reality is that data from 20 years ago may not provide an accurate assessment of today's consumers, given major changes in interest rates, lending practices, and consumers' individual circumstances. This company may be better off focusing on other sources of current data, such as updated credit information, to produce an accurate risk assessment.

The current commercial real estate market highlights this challenge clearly. Many risk prediction models were trained on pre-pandemic data, before the systemic shift to online shopping and remote work. To reduce the chance of inaccurate predictions, discuss with business stakeholders how data becomes stale and less useful over time, and what data best reflects today's world.

Handling outdated data: identify, delete, or anonymize

To decide how long to retain data, start with affirmative legal obligations to maintain financial records or industry regulations regarding transactions that involve personal data. Look at legal statutes of limitations to determine how long data must be retained if it is needed to defend against potential litigation, and retain only the personal information necessary for a potential legal defense, such as transaction logs or evidence of user consent, rather than all data about individual users.

When it comes time to delete less useful information, data can be deleted manually based on the retention period defined in the retention schedule for each data type. Automating the process via a cleanup policy improves reliability. It is also possible to use an anonymization process to remove identifiable personal data, or to work with fully anonymized data, but this presents new challenges.
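An automated cleanup policy can be sketched as a retention schedule plus a purge routine that compares each record's age against its data type's retention period. The data types and periods below are illustrative assumptions for the sketch, not legal guidance.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical retention schedule: days to keep each data type.
RETENTION_DAYS = {
    "web_logs": 90,
    "support_tickets": 365 * 2,
    "financial_records": 365 * 7,  # e.g. a statutory bookkeeping requirement
}

def is_expired(record_type: str, created_at: datetime, now: datetime) -> bool:
    """Return True if a record has outlived its retention period."""
    days = RETENTION_DAYS.get(record_type)
    if days is None:
        return False  # no policy defined: keep the record and flag for review
    return now - created_at > timedelta(days=days)

def purge(records: list, now: datetime) -> list:
    """Keep only records still inside their retention window."""
    return [r for r in records
            if not is_expired(r["type"], r["created_at"], now)]

now = datetime(2024, 6, 1, tzinfo=timezone.utc)
records = [
    {"type": "web_logs", "created_at": now - timedelta(days=120)},  # expired
    {"type": "web_logs", "created_at": now - timedelta(days=30)},   # kept
]
kept = purge(records, now)
```

In practice a job like this would run on a schedule against the production data store; the key design choice is that records with no defined policy are retained and escalated rather than silently deleted.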

Truly deidentified data generally falls within the exemptions of privacy laws, but doing this properly requires removing so much value that there often isn't much left. Deidentification requires removing not only unique and direct identifiers such as Social Security numbers and names, but also indirect identifiers, including information such as customer IP addresses. For example, to comply with the HIPAA Safe Harbor standard, an organization must remove a list of 18 identifiers. An organization may want to take this approach to maintain the performance of an analytics or AI model, but it is important to discuss the pros and cons with stakeholders first.
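The mechanics of stripping identifiers can be sketched as follows. The identifier lists here name only a few examples of each kind and are assumptions for illustration; HIPAA Safe Harbor enumerates 18 categories, and a real implementation would need the full, legally reviewed list for the applicable regime.

```python
import copy

# Illustrative (incomplete) identifier lists, not the full Safe Harbor set.
DIRECT_IDENTIFIERS = {"name", "ssn", "email"}
INDIRECT_IDENTIFIERS = {"ip_address", "zip_code", "birth_date"}

def deidentify(record: dict) -> dict:
    """Drop direct and indirect identifiers from a record (sketch only)."""
    cleaned = copy.deepcopy(record)
    for key in DIRECT_IDENTIFIERS | INDIRECT_IDENTIFIERS:
        cleaned.pop(key, None)  # remove the field if present
    return cleaned

record = {
    "name": "A. Example",
    "ssn": "000-00-0000",
    "ip_address": "203.0.113.7",
    "purchase_total": 42.50,
}
cleaned = deidentify(record)  # only non-identifying fields remain
```

The sketch also illustrates the trade-off discussed above: after removing both direct and indirect identifiers, only the non-identifying fields survive, which is often too little to support the analytics the data was kept for.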

Avoid common pitfalls

The biggest mistake companies make when dealing with stale data is rushing the process and skipping these in-depth conversations. Project owners must resist the urge to speed things up and recognize that proper feedback from multiple groups is critical. Companies should work with legal, privacy, and security teams, as well as business leaders, to get feedback on what data absolutely must be retained, and avoid a retention policy and schedule that accidentally deletes something the company needs. It is easier to shorten retention periods over time and keep less personal information, but once data is gone, it's gone, so measure twice and cut once.

As outlined above, there are several considerations when dealing with stale data, including basic data mapping and lineage, defining retention-period criteria, and developing an effective implementation of those policies. Navigating the intricacies of data deletion requires a strategic and informed approach. By understanding the legal, cybersecurity, and financial implications, companies can develop a robust data retention strategy that not only complies with regulations but also effectively protects their digital assets.
