# Collection structure

To let you export your messages, Text Collector creates a zip of either everything in a collection or a single conversation, depending on what you choose to share. When you share the collection, the output file will be called Messages [timestamp].zip. The timestamp is in UTC so that collections can easily be sorted by when they were created.

Download a sample collection

Note

Text Collector zips the export even when zipping isn’t strictly necessary, that is, when you choose to share only one conversation. This is good for legal discovery, but if you’re looking for a quick way to archive text messages, it might be inconvenient. To avoid zipping single conversations, you can turn off the “zip always” option in settings.

If the shared conversation has attachments, however, Text Collector always zips it.

Inside the zip, you’ll find one or more PDF files and, sometimes, folders with attachments.
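As a rough illustration of that layout, the following Python sketch splits the entries of a collection zip into top-level PDFs and attachment folders. The function name `summarize_collection` and the conversation file names (`1. Xavier.pdf`, `1. Xavier/1. video.mp4`) are made up for the demonstration; only `0. Information.pdf` is a name taken from this page.

```python
import io
import zipfile
from collections import defaultdict


def summarize_collection(zip_file):
    """Split a collection zip's entries into top-level PDFs and folders.

    Hypothetical helper: it only assumes that PDFs sit at the top level
    and that attachments live one folder deep.
    """
    pdfs = []
    folders = defaultdict(list)
    with zipfile.ZipFile(zip_file) as archive:
        for name in archive.namelist():
            if name.endswith("/"):
                continue  # skip explicit directory entries
            if "/" in name:
                folder, _, filename = name.partition("/")
                folders[folder].append(filename)
            elif name.lower().endswith(".pdf"):
                pdfs.append(name)
    return sorted(pdfs), dict(folders)


# Build a tiny in-memory zip with made-up names to demonstrate.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as demo:
    demo.writestr("0. Information.pdf", b"")
    demo.writestr("1. Xavier.pdf", b"")
    demo.writestr("1. Xavier/1. video.mp4", b"")
buffer.seek(0)

pdfs, folders = summarize_collection(buffer)
print(pdfs)
print(folders)
```

To inspect a real export, pass the path of the zip file instead of the in-memory buffer.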

## Information document

The information document, called 0. Information.pdf, summarizes what is in a collection. Its content looks something like this:

| Content | What it means |
| --- | --- |
| Collected from Bob (540) 555-0101, bob@example.com | Owner’s name and contact information found on the phone |
| Collected on Apr 11, 2017 in Eastern Time, United States | When and where the messages were collected. Messages reflect local time in this time zone. |
| Date filter: Messages from Apr 6, 2017 to Apr 10, 2017 | What date filter, if any, was applied. |
| Conversation: All conversations | Who the other party was on the exported conversation, if only one conversation was exported |
| 77 messages found. First message on Apr 6, 2017. Last message on Apr 10, 2017 | First and last message can help identify when the phone was in use. If my date filter started in 2015, but the first message was in 2016, it’s a good bet that I bought the phone in 2016. |

After these, there are technical details: the device and the software versions running, which can be useful if a particular device or Android version is known to cause problems. Finally, there is a disclaimer to remind you I’m not responsible for the consequences of using this software.

## Messages

Each conversation gets a separate document. “Conversation” means all messages to or from a given person or phone number. So, in this example, “Xavier” and “Xena” are two different people, but “262966” is a phone number that doesn’t have an entry in my phone’s address book:

Text Collector version 1.1 onward sorts conversations in this order:

1. Named people, alphabetically
2. Phone numbers without names
3. Messages to yourself
4. Messages to or from unknown numbers
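The ordering above can be sketched as a sort key. This is only an illustration of the four groups, not Text Collector's actual code: the `Conversation` class, its fields, and the case-insensitive comparison are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Conversation:
    # Hypothetical model of a conversation, for illustration only.
    name: Optional[str] = None    # contact name, if in the address book
    number: Optional[str] = None  # None when the number is unknown
    is_self: bool = False         # True for messages to yourself


def sort_key(conversation):
    """Return (group, label) so that sorting follows the documented order."""
    if conversation.is_self:
        group = 2  # messages to yourself
    elif conversation.name:
        group = 0  # named people, alphabetically
    elif conversation.number:
        group = 1  # phone numbers without names
    else:
        group = 3  # unknown numbers
    label = (conversation.name or conversation.number or "").lower()
    return (group, label)


conversations = [
    Conversation(),  # unknown number
    Conversation(name="Me", number="5405550101", is_self=True),
    Conversation(number="262966"),
    Conversation(name="Xena", number="5405550103"),
    Conversation(name="Xavier", number="5405550102"),
]

ordered = sorted(conversations, key=sort_key)
labels = [c.name or c.number or "(unknown)" for c in ordered]
print(labels)
```

Checking `is_self` first keeps your own conversation in the third group even when your number also has a name in the address book.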

When a person has more than one phone number, Text Collector displays only the first phone number, alphabetically, in the name of the PDF. It prefixes each file with a number so that you can tell which is which when two people have the same name and number. This is unusual, but not impossible:

If these really were the same person, you could unify them into one contact (in your phone’s address book), but you must do so before collecting the messages. Always ask your attorney before changing data that might be used in litigation.

## Attachments

Two of the people in this collection have not just ordinary messages, but also attachments. Attachments include files that can’t be easily displayed in PDF, such as videos.

Note

For ediscovery professionals

If you process Text Collector output through a tool with deduplication capability, you should not deduplicate it. Text Collector can intentionally duplicate attachments, and deduplication would make the data look incomplete.

Any conversation with attachments gets a folder whose name matches the PDF:

Within these folders, each attachment is numbered for cross-reference with the PDF:

| PDF content | Folder content |
| --- | --- |
| Notice label [Attachment 1] | Filename starts with number one |

Thus, attachment numbers in the PDF match numbers on attachment files.
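If you need to check this cross-reference in bulk, a small script can pair each `[Attachment N]` label with the file whose name starts with the same number. The sketch below assumes the PDF text has already been extracted by some other tool; the function name `match_attachments` and the sample file names are hypothetical.

```python
import re

# Label format as shown in the PDF, e.g. "[Attachment 1]".
LABEL = re.compile(r"\[Attachment (\d+)\]")
# Assumption: an attachment file name begins with its number.
LEADING_NUMBER = re.compile(r"^(\d+)")


def match_attachments(pdf_text, filenames):
    """Map each attachment number found in the text to its file, or None."""
    by_number = {}
    for filename in filenames:
        found = LEADING_NUMBER.match(filename)
        if found:
            by_number[int(found.group(1))] = filename
    return {int(n): by_number.get(int(n)) for n in LABEL.findall(pdf_text)}


# Made-up conversation text and folder listing.
text = "Look at this [Attachment 1] and this one too [Attachment 2]"
files = ["1. video.mp4", "2. photo.jpg"]

matches = match_attachments(text, files)
print(matches)
```

A label that maps to `None` would indicate an attachment referenced in the PDF with no matching file in the folder.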