metadata syntax

anarsec 2023-08-17 19:37:48 +00:00
parent 443fd9ac28
commit 0442404847
2 changed files with 13 additions and 13 deletions

@@ -1,5 +1,5 @@
+++
-title="Removing Identifying Metadata From Files"
+title="Remove Identifying Metadata From Files"
date=2023-04-03
[taxonomies]
@@ -15,37 +15,37 @@ letter="metadata-letter.pdf"
+++
-[Metadata](/glossary/#metadata) is 'data about data', or 'information about information'. In the context of files, this can mean information that is automatically embedded in the file, and this information can be used to deanonymize you. For example, an image file will often have metadata about when it was taken, where, on which camera, etc. A PDF file may have information about which program created it, on which computer, etc. This can be used by investigators to link a photo to the camera it was taken on, link a video to the computer it was edited on, and so on. To learn more about how metadata can be used to identify and reveal personal information, see [Behind the Data: Investigating metadata](https://exposingtheinvisible.org/en/guides/behind-the-data-metadata-investigations/). Before putting a sensitive file onto the Internet, clean the metadata from it.
+[Metadata](/glossary/#metadata) is 'data about data' or 'information about information'. In the context of files, this can mean information that is automatically embedded in the file, and this information can be used to deanonymize you. For example, an image file will often have metadata about when it was taken, where it was taken, what camera it was taken with, and so on. A PDF file may have information about what program created it, what computer, etc. This can be used by investigators to link a photo to the camera on which it was taken, a video to the computer on which it was edited, and so on. To learn more about how metadata can be used to identify and reveal personal information, see [Behind the Data: Investigating metadata](https://exposingtheinvisible.org/en/guides/behind-the-data-metadata-investigations/). Before you put a sensitive file on the Internet, cleanse it of metadata.
<!-- more -->
# Metadata Anonymization Toolkit
-Thankfully, there is a tool that comprehensively cleans metadata, and it is available as both a [command line interface](/glossary#command-line-interface-cli) and a graphical user interface. The command line version is called `mat2` and is [open-source](https://0xacab.org/jvoisin/mat2), and the graphical version is called [Metadata Cleaner](https://metadatacleaner.romainvigier.fr/) and is also [open-source](https://gitlab.com/rmnvgr/metadata-cleaner/). Both programs are included in [Tails](/tags/tails/) and [Qubes-Whonix](/posts/qubes/#whonix-and-tor) by default.
+Fortunately, there is a tool that comprehensively cleans metadata, and it is available as both a [command line interface](/glossary#command-line-interface-cli) and a graphical user interface. The command line version is called `mat2` and is [open-source](https://0xacab.org/jvoisin/mat2), and the graphical version is called [Metadata Cleaner](https://metadatacleaner.romainvigier.fr/) and is also [open-source](https://gitlab.com/rmnvgr/metadata-cleaner/). Both programs are included in [Tails](/tags/tails/) and [Qubes-Whonix](/posts/qubes/#whonix-and-tor) by default.
-# Using Metadata Cleaner
+# Using the Metadata Cleaner
-Unless you are comfortable with the command line, we recommend Metadata Cleaner - it is using `mat2` under the hood, so has all of the same functionality. Metadata Cleaner is better than Exiftool and other software that removes metadata - see the [comparison docs](https://0xacab.org/jvoisin/mat2/-/blob/master/doc/comparison_to_others.md).
+If you are not comfortable with the command line, we recommend using Metadata Cleaner - it uses `mat2` under the hood, so it has all the same functionality. Metadata Cleaner is better than Exiftool and other metadata removal software - see the [comparison docs](https://0xacab.org/jvoisin/mat2/-/blob/master/doc/comparison_to_others.md).
-Metadata Cleaner displays metadata that it detects, but "it doesn't mean that a file is clean from any metadata if mat2 doesn't show any. There is no reliable way to detect every single possible metadata for complex file formats." You should clean the file even if no metadata is displayed.
+Metadata Cleaner shows the metadata it detects, but "it doesn't mean that a file is clean from any metadata if mat2 doesn't show any. There is no reliable way to detect every single possible metadata for complex file formats." You should clean the file even if no metadata is displayed.
-To use Metadata Cleaner, first add a file. If you click on it, the current metadata will be displayed. Highlight the file, then select **Clean**, followed by **Save**. You can double-check that the metadata was removed by re-adding the cleaned file and displaying its metadata.
+To use the Metadata Cleaner, first add a file. When you click it, the current metadata is displayed. Select the file, then select **Clean**, followed by **Save**. You can verify that the metadata has been removed by re-adding the cleaned file and viewing its metadata.
-Cleaning a PDF file will transform it into images, which makes it no longer possible to select the text in it. If you would like to retain this ability, there is a *lightweight* cleaning mode, which only cleans the superficial metadata of your file but not the metadata of embedded resources (such as of images in the PDF). Embedded resources having metadata can be avoided using Metadata Cleaner on the images before importing them to the layout software, and using layout software on Tails or Qubes-Whonix like Scribus which will be generic to those operating systems. You can enable "Lightweight mode" in the settings of Metadata Cleaner.
+When you clean a PDF file, it is converted to images, so you cannot select the text in it. If you want to retain this ability, there is a *lightweight* cleaning mode that cleans only the superficial metadata of your file, but not the metadata of embedded resources (such as images in the PDF). Embedded resources with metadata can be avoided by using Metadata Cleaner on the images before importing them into the layout software, and by using layout software on Tails or Qubes-Whonix such as Scribus that are generic for those operating systems. You can enable "lightweight mode" in the Metadata Cleaner settings.
-Keep in mind the limitations of Metadata Cleaner: "mat2 only removes metadata from your files, it does not anonymise their content, nor can it handle watermarking, steganography, or any too custom metadata field/system. If you really want to be anonymous, use file formats that do not contain any metadata, or better: use plain-text."
+Note the limitations of Metadata Cleaner: "mat2 only removes metadata from your files, it does not anonymise their content, nor can it handle watermarking, steganography, or any too custom metadata field/system. If you really want to be anonymous, use file formats that do not contain any metadata, or better: use plain-text."
# Photo and Video Forensics
-Even though it is possible to clean all metadata from an image or video, forensic examination may nonetheless determine which device was used to capture it. As the Whonix [docs](https://www.whonix.org/wiki/Surfing_Posting_Blogging#Photographs) note:
+While it is possible to remove all metadata from an image or video, forensic examination may still reveal what device was used to capture it. As the Whonix [docs](https://www.whonix.org/wiki/Surfing_Posting_Blogging#Photographs) note:
> Every camera's sensor has a unique noise signature because of subtle hardware differences. The sensor noise is detectable in the pixels of every image and video shot with the camera and could be fingerprinted. In the same way ballistics forensics can trace a bullet to the barrel it came from, the same can be accomplished with adversarial digital forensics for all images and videos. Note this effect is different from file metadata.
-Multiple photos or videos from the same camera can be tied together in this way, and if the camera is recovered it can be confirmed to be where the files came from. Cheap cameras can be acquired from a refurbished store and used only once for images or videos that require high security.
+Multiple photos or videos from the same camera can be tied together in this way, and if the camera is recovered, it can be confirmed where the files came from. Cheap cameras can be purchased from a refurbished store and used only once for pictures or videos that require high security.
# Printer Forensics
-All modern printers leave invisible watermarks in order to encode information such as the serial number of the printer and and when it was printed. If printed material is scanned, these markings are present in the file. To learn more, see [Revealing Traces in printouts and scans](https://dys2p.com/en/2022-09-print-scan-traces.html) and the Whonix documentation on [printing and scanning](https://www.whonix.org/wiki/Printing_and_Scanning).
+All modern printers leave invisible watermarks to encode information such as the serial number of the printer and when it was printed. When printed material is scanned, these marks are present in the file. To learn more, see [Revealing Traces in Printouts and Scans](https://dys2p.com/en/2022-09-print-scan-traces.html) and the Whonix documentation on [printing and scanning](https://www.whonix.org/wiki/Printing_and_Scanning).
# Further Reading

@@ -37,7 +37,7 @@ Let's start by looking at the [Tails Warnings page](https://tails.boum.org/doc/a
You can mitigate this first issue by **cleaning metadata from files before sharing them**:
-* To learn how, see [Removing Identifying Metadata From Files](/posts/metadata/).
+* To learn how, see [Remove Identifying Metadata From Files](/posts/metadata/).
### 2. Using Tails for more than one purpose at a time