# Collection structure

To let you share your messages, Text Collector creates a zip of everything in a collection. When you share the collection, the output file will be called Messages.zip, or, on Google Drive, Text Message Collection.

Inside the zip, you’ll find one or more PDF files and, sometimes, folders with attachments.
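If you want to check a collection programmatically, the layout described above can be inspected with Python's standard `zipfile` module. This is only a sketch: it assumes the PDFs sit at the top level of the zip and that attachments sit one folder deep, and the function names `split_members` and `inspect_collection` are made up for illustration.

```python
import zipfile
from pathlib import PurePosixPath


def split_members(names):
    """Separate top-level PDFs from files stored inside folders.

    Assumes PDFs are at the zip root and attachments are inside folders;
    this layout is an assumption, not a documented guarantee.
    """
    pdfs = []
    attachments = {}
    for name in names:
        if name.endswith("/"):
            continue  # directory entry, carries no content of its own
        path = PurePosixPath(name)
        if len(path.parts) == 1 and path.suffix.lower() == ".pdf":
            pdfs.append(path.name)
        elif len(path.parts) > 1:
            # Group attachment files under their top-level folder name.
            attachments.setdefault(path.parts[0], []).append(path.name)
    return pdfs, attachments


def inspect_collection(zip_path="Messages.zip"):
    """Open a shared collection and split its members as above."""
    with zipfile.ZipFile(zip_path) as archive:
        return split_members(archive.namelist())
```

Calling `inspect_collection()` on a downloaded Messages.zip would return the list of PDFs and a mapping from each attachment folder to the files inside it.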

Download a sample collection

## Information document

The information document, called 0. Information.pdf, summarizes what is in a collection. Its content looks something like this:

| Sample content | What it tells you |
| --- | --- |
| Collected from Bob (540) 555-0101, bob@example.com | Owner’s name and contact information found on the phone. |
| Collected on Apr 11, 2017 in Eastern Time, United States | When and where the messages were collected. Messages reflect local time in this time zone. |
| Filter: Messages from Apr 6, 2017 to Apr 10, 2017 | What date filter, if any, was applied. |
| 77 messages found<br>First message on Apr 6, 2017<br>Last message on Apr 10, 2017 | First and last message can help identify when the phone was in use. If my date filter started in 2015, but the first message was in 2016, it’s a good bet that I bought the phone in 2016. |

After these, there are technical details: the device and the software versions running, which can be useful if a particular device or Android version is known to cause problems. Finally, there is a disclaimer to remind you I’m not responsible for the consequences of using this software.

## Messages

Each conversation gets a separate document. “Conversation” means all messages to or from a given person or phone number. So, in this example, “Xavier” and “Xena” are two different people, while “262966” is a phone number with no entry in my phone’s address book.

Text Collector version 1.1 onward sorts conversations in this order:

1. Named people, alphabetically
2. Phone numbers without names
3. Messages to yourself
4. Messages to or from unknown numbers
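The ordering above can be sketched as a sort key. The `Kind` enum and `Conversation` class are hypothetical names for illustration, not Text Collector's actual code. The list only says that named people are sorted alphabetically, so this sketch assumes the other groups simply keep their input order.

```python
from dataclasses import dataclass
from enum import IntEnum


class Kind(IntEnum):
    """Conversation groups, numbered in the order they are listed above."""
    NAMED = 0
    NUMBER_ONLY = 1
    SELF = 2
    UNKNOWN = 3


@dataclass
class Conversation:
    label: str  # contact name or phone number (hypothetical field)
    kind: Kind


def sort_conversations(conversations):
    """Order by group first; sort alphabetically only within named people."""
    def key(conv):
        # Assumption: groups other than NAMED keep their input order,
        # which Python's stable sort preserves when the key ties.
        name = conv.label.casefold() if conv.kind is Kind.NAMED else ""
        return (conv.kind, name)

    return sorted(conversations, key=key)
```

Encoding the group as the first element of the key keeps the four categories separate, so a phone number can never sort ahead of a named person.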

When a person has more than one phone number, Text Collector displays only the first phone number, alphabetically, in the name of the PDF. It prefixes each file name with a number so that you can tell the files apart when two people have the same name and number. This is unusual, but not impossible.

If these really were the same person, you could merge them into one contact (in your phone’s address book), but you must do so before collecting the messages. Always ask your attorney before changing data that might be used in litigation.

## Attachments

Two of the people in this collection have not just ordinary messages, but also attachments.

Note

For ediscovery professionals

If you process Text Collector output through a tool with deduplication capability, you should not deduplicate it. Text Collector may intentionally duplicate attachments, so the processed data can appear incomplete after deduplication.

Any conversation with attachments gets a folder whose name matches the PDF.

In this case, the attachments are videos. Within these folders, each attachment is numbered for cross-reference with the PDF:

| PDF content | Folder content |
| --- | --- |
| Notice label [Attachment 1] | Filename starts with number one |

Thus, attachment numbers in the PDF match numbers on attachment files.
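The cross-reference can be sketched in a few lines of Python. This assumes you have already extracted the conversation text from the PDF by some other means, that labels look exactly like `[Attachment 1]`, and that each attachment filename begins with its number. The function name `match_attachments` and the filenames in the usage note are hypothetical.

```python
import re

# Label format as it appears in the PDF, e.g. "[Attachment 1]".
LABEL = re.compile(r"\[Attachment (\d+)\]")
# Assumption: the attachment filename starts with its number.
LEADING_NUMBER = re.compile(r"^(\d+)")


def match_attachments(pdf_text, filenames):
    """Map each attachment number found in the text to its file, if any."""
    by_number = {}
    for name in filenames:
        found = LEADING_NUMBER.match(name)
        if found:
            by_number[int(found.group(1))] = name
    # A label with no corresponding file maps to None.
    return {int(n): by_number.get(int(n)) for n in LABEL.findall(pdf_text)}
```

For example, `match_attachments("See [Attachment 1]", ["1. clip.mp4"])` pairs label 1 with the file whose name starts with 1. A label that has no file maps to `None`, which makes missing attachments easy to spot.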