
Blog  |  July 17, 2025

Taming Modern Data Challenges: Linked Documents

In our last post, we discussed considerations for conversational ESI in eDiscovery, including how to define a “conversation”, the lack of clear standards for doing so, and strategies for preserving and producing conversation-based ESI in discovery.

One of the biggest and most discussed modern data challenges that legal and eDiscovery professionals face today is discovery of linked documents. Documents that were historically sent as physical attachments to an email or other communication are now more frequently linked from another source, creating an enormous challenge in preserving and collecting that ESI. Opinions vary widely regarding treatment of these files in discovery and even what they should be called.

A common question we hear is: Do I need to collect linked documents? And if so, do I need to show how they relate to the parent message? The short answer is: it depends. If you don’t have to go down that path, don’t. It’s technically complex, and once you start linking documents to parent messages, you’ll face a host of decisions—each with trade-offs. In light of evolving case law on this issue, there are multiple ways to establish those relationships.

Given how this topic has captivated the legal and eDiscovery communities, we have four posts on linked documents. In this post, we will discuss the reason that linked documents present such a challenge, the dispute over their nomenclature, and the historical collection challenges associated with them. In the second post, we’ll discuss the preservation challenges associated with linked documents and best practices for addressing them in eDiscovery. In the third post, we’ll discuss technological developments and updated approaches for handling linked files. In the final post, we’ll discuss recent case law rulings regarding the treatment of linked documents in discovery.

The Challenge of Linked Documents

The ability to hyperlink to content – such as web pages, folders and documents – from within an email has basically always existed. So, why is this a “modern data” challenge? Because of the shift to the cloud, it has become common to link to documents in cloud-based data sources instead of embedding them within an email or communication, which had previously been the standard. When attachments are embedded within the email, a snapshot of the attached documents is saved when the email is sent, thereby preserving the attachment along with the email communication. It became standard in eDiscovery protocol to treat the emails and the physical attachments as a “family group”, with the email as the parent and the attachments as the children.

Linking to documents instead of embedding them in communications is great from a records management and information governance perspective because it reduces data redundancy. You no longer have multiple copies of the same document in different recipients’ inboxes, which improves data hygiene. But it can create problems from an eDiscovery perspective because you no longer have a snapshot of the attached document. When collecting ESI for discovery, that document has to be collected separately via the link. It may have been modified or deleted since the link was created, which can make it difficult or even impossible to retrieve the version that was actually sent, or any version at all.
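To illustrate why linked documents need a separate collection step, here is a minimal sketch in Python that scans an email body for links pointing at cloud document stores. The host list, the function name `find_linked_documents`, and the sample URL are assumptions made for illustration only; a real collection workflow would rely on the platform's own export tooling rather than pattern matching.

```python
import re
from typing import List
from urllib.parse import urlparse

# Assumption: an illustrative, non-exhaustive set of cloud document hosts.
CLOUD_HOST_SUFFIXES = ("sharepoint.com", "docs.google.com", "drive.google.com")

# Deliberately simple URL pattern; real email bodies (HTML, rewritten links) need more care.
URL_PATTERN = re.compile(r"https?://[^\s<>\"']+")


def find_linked_documents(email_body: str) -> List[str]:
    """Return URLs in the body whose host matches one of the cloud hosts above."""
    links = []
    for match in URL_PATTERN.finditer(email_body):
        # Drop trailing sentence punctuation that the pattern may have captured.
        url = match.group(0).rstrip(".,;)")
        host = (urlparse(url).hostname or "").lower()
        if any(host == s or host.endswith("." + s) for s in CLOUD_HOST_SUFFIXES):
            links.append(url)
    return links


body = (
    "Please review https://contoso.sharepoint.com/sites/legal/Draft.docx "
    "and the background at https://example.com/page."
)
print(find_linked_documents(body))
```

Each URL returned is only a pointer. The document behind it still has to be collected from its source, and it may no longer match what the recipient saw.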

The Dispute Over Nomenclature: What’s In a Name?

Within the legal industry, there is disagreement on what to call documents that are linked from an email communication. Some people refer to them simply as “hyperlinks”, while others call them “linked files” or “hyperlinked files”. The term that tends to cause the most controversy is “modern attachments” (or “cloud attachments”, which is what Microsoft calls them), because it connotes a parent/child relationship akin to that of traditional attached snapshots. Some argue that these documents should be treated just like traditional attachments and that there should be a requirement to produce them in discovery as part of a family group. Others argue that they should not be treated as old school attachments, nor be produced with the communication, or even at all.

At Cimplifi, we prefer to use the term “linked documents” as this term limits the scope to the version of the document linked within a communication. As cloud-based email communication tools have become common in organizations, there is often a need to collect emails and their linked documents for production in discovery. Keep in mind, however, that the courts and the industry have not settled on a standardized term, so expect to continue to see a variety of appellations used to describe linked documents.

Historical Collection Challenges with Linked Documents

The dynamic nature of linked documents and the fact that they are stored apart from the communication itself create considerable challenges in discovery. Microsoft 365 and Google Workspace are the most common platforms from which linked documents are subject to collection, but this challenge can extend to any communication platform that supports linking to documents, including collaboration solutions like Slack and Teams. Here are some of the most common historical challenges for collecting linked documents:

Version Control

Linked documents can be edited or deleted after sharing, making it difficult to capture the exact version referenced at the time of communication. For instance, Google Vault may only provide the ability to export the current version of a document, not the version as it existed when the email was sent.

M365 can export the version of the document that was originally sent, but only for organizations with an eDiscovery (Premium) license (commonly known as E5). Even with E5, the organization must have created a Microsoft Purview retention label in advance and applied it to “cloud attachments” so that a copy of a document is created at the time it is shared.
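The version problem can be sketched in a few lines of Python. Assuming a version history is available at all (which, as noted above, is not a given), the task is to pick the latest version saved at or before the moment the message was sent. The `DocVersion` type, the `version_at_send_time` function, and the sample data are hypothetical and do not represent the Google Vault or Microsoft Purview APIs.

```python
from bisect import bisect_right
from datetime import datetime, timezone
from typing import NamedTuple, Optional, Sequence


class DocVersion(NamedTuple):
    """Hypothetical record for one saved version of a linked document."""
    version_id: str
    saved_at: datetime


def version_at_send_time(
    versions: Sequence[DocVersion], sent_at: datetime
) -> Optional[DocVersion]:
    """Return the latest version saved at or before sent_at, or None if there is none."""
    ordered = sorted(versions, key=lambda v: v.saved_at)
    timestamps = [v.saved_at for v in ordered]
    # bisect_right counts how many versions were saved at or before sent_at.
    idx = bisect_right(timestamps, sent_at)
    return ordered[idx - 1] if idx else None


# Made-up sample history for illustration.
history = [
    DocVersion("1.0", datetime(2025, 3, 1, 9, 0, tzinfo=timezone.utc)),
    DocVersion("2.0", datetime(2025, 3, 4, 15, 30, tzinfo=timezone.utc)),
    DocVersion("3.0", datetime(2025, 3, 10, 11, 0, tzinfo=timezone.utc)),
]

email_sent = datetime(2025, 3, 5, 8, 0, tzinfo=timezone.utc)
print(version_at_send_time(history, email_sent))
```

If the platform exposes only the current version, there is no history to search, and the version the recipient actually saw cannot be recovered this way.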

Access and Permissions

Linked documents may reside in various locations with different access controls, such as individual OneDrive accounts or shared SharePoint libraries, complicating the collection process. If access to those documents isn’t granted or is later revoked, collection can be challenging. Indeed, if the sender of that document is in another organization that is not a party to the litigation, it may be necessary to serve a third-party subpoena to collect that data.

Ubiquity of Communication Platforms

While M365 and Google Workspace have been the most common platforms involving collection of linked documents (and the primary platforms for which case law rulings currently exist), the rise in popularity of collaboration solutions like Slack, Teams, WhatsApp, Snapchat, Signal, and many other chat apps within organizations increases the potential need to collect linked documents from these platforms as well.

Conclusion

Linked documents are one of the biggest modern data challenges that require taming today. In fact, there are so many challenges associated with linked documents that it takes more than one post to discuss them all. It’s not just what to call them and how to collect them – there are preservation challenges as well.

In our next post in the series, we will continue with the preservation challenges associated with linked documents and best practices for addressing them!

For more regarding Cimplifi forensics & collections capabilities, click here.
