In today's digital landscape, AI-generated content is becoming increasingly sophisticated and widespread. While this brings incredible creative possibilities, it also raises concerns about authenticity and trust. How can we embrace AI's potential while maintaining transparency about content origins?
C2PA (Coalition for Content Provenance and Authenticity) offers a solution. It defines an open standard for cryptographically signed provenance metadata, which major tech companies like Adobe and OpenAI have adopted to clearly identify AI-generated content.
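In this guide, you will learn how to read C2PA metadata with Python, detect tampering of an image's pixels or metadata, and check whether an image was signed by a trusted provider.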
Below is an image generated by DALL-E through ChatGPT, verified with https://contentcredentials.org/verify
This image includes C2PA metadata that tells us:
- It was "Generated by OpenAI"
- Which model created it (DALL-E)
- A digital signature from OpenAI
This is a runnable notebook demonstrating C2PA with Python. We'll use fast-c2pa-python, our wrapper of c2pa-rs. The code snippets below are presented as they would appear in a Jupyter Notebook, including the output of each cell. Commands prefixed with ! are shell commands executed from within the notebook.
Note that the images displayed in this blog post have already had their C2PA metadata removed during processing, so you'll need to use your own images from ChatGPT, Adobe tools, or other C2PA-enabled applications to follow along with the examples.
Below is an image generated by ChatGPT
from fast_c2pa_python import read_c2pa_from_file
from PIL import Image
# Read C2PA metadata from our example image
metadata = read_c2pa_from_file("https://sightengine.com/assets/img/blog/c2pa/chatgpt_image.jpg")
# Get the active manifest which contains the main C2PA data
active_manifest_id = metadata["active_manifest"]
active_manifest = metadata["manifests"][active_manifest_id]
print("C2PA Metadata:")
print(f"- Validation State: {metadata['validation_state']}")
if "signature_info" in active_manifest:
    print(f"- Signed by: {active_manifest['signature_info'].get('issuer', 'Unknown')}")
C2PA Metadata:
- Validation State: Valid
- Signed by: OpenAI
This image is C2PA-valid and signed by OpenAI. The full metadata is shown below:
{'active_manifest': 'urn:c2pa:35adfcd7-6ebe-463f-9829-310afd9cefcb',
'manifests': {'urn:c2pa:f24e32e2-fbe2-4bf0-b31f-7b69ae159067': {'claim_generator_info': [{'name': 'ChatGPT',
'org.cai.c2pa_rs': '0.49.5'}],
'title': 'image.png',
'instance_id': 'xmp:iid:db163776-cbae-4c3d-89b3-59b3ca3c1fba',
'ingredients': [],
'assertions': [{'label': 'c2pa.actions.v2',
'data': {'actions': [{'action': 'c2pa.created',
'softwareAgent': {'name': 'GPT-4o'},
'digitalSourceType': 'http://cv.iptc.org/newscodes/digitalsourcetype/trainedAlgorithmicMedia'},
{'action': 'c2pa.converted',
'softwareAgent': {'name': 'OpenAI API'}}]}}],
'signature_info': {'alg': 'Es256',
'issuer': 'OpenAI',
'cert_serial_number': '631872854730012650133502748526736898092667640635'},
'label': 'urn:c2pa:f24e32e2-fbe2-4bf0-b31f-7b69ae159067'},
'urn:c2pa:35adfcd7-6ebe-463f-9829-310afd9cefcb': {'claim_generator_info': [{'name': 'ChatGPT',
'org.cai.c2pa_rs': '0.49.5'}],
'title': 'image.png',
'instance_id': 'xmp:iid:b3aff8fe-bbb9-4812-91c5-b7343a8dfd94',
'ingredients': [{'title': 'image.png',
'format': 'png',
'instance_id': 'xmp:iid:eefb9ea0-fe68-4e6a-a97a-a21e21f45b1f',
'thumbnail': {'format': 'image/jpeg',
'identifier': 'self#jumbf=c2pa.assertions/c2pa.thumbnail.ingredient.jpeg'},
'relationship': 'componentOf',
'active_manifest': 'urn:c2pa:f24e32e2-fbe2-4bf0-b31f-7b69ae159067',
'validation_results': {'activeManifest': {'success': [{'code': 'claimSignature.insideValidity',
'url': 'self#jumbf=/c2pa/urn:c2pa:f24e32e2-fbe2-4bf0-b31f-7b69ae159067/c2pa.signature',
'explanation': 'claim signature valid'},
{'code': 'claimSignature.validated',
'url': 'self#jumbf=/c2pa/urn:c2pa:f24e32e2-fbe2-4bf0-b31f-7b69ae159067/c2pa.signature',
'explanation': 'claim signature valid'},
{'code': 'assertion.hashedURI.match',
'url': 'self#jumbf=/c2pa/urn:c2pa:f24e32e2-fbe2-4bf0-b31f-7b69ae159067/c2pa.assertions/c2pa.actions.v2',
'explanation': 'hashed uri matched: self#jumbf=c2pa.assertions/c2pa.actions.v2'},
{'code': 'assertion.hashedURI.match',
'url': 'self#jumbf=/c2pa/urn:c2pa:f24e32e2-fbe2-4bf0-b31f-7b69ae159067/c2pa.assertions/c2pa.hash.data',
'explanation': 'hashed uri matched: self#jumbf=c2pa.assertions/c2pa.hash.data'},
{'code': 'assertion.dataHash.match',
'url': 'self#jumbf=/c2pa/urn:c2pa:f24e32e2-fbe2-4bf0-b31f-7b69ae159067/c2pa.assertions/c2pa.hash.data',
'explanation': 'data hash valid'}],
'informational': [],
'failure': []}},
'label': 'c2pa.ingredient.v3'}],
'assertions': [],
'signature_info': {'alg': 'Es256',
'issuer': 'OpenAI',
'cert_serial_number': '631872854730012650133502748526736898092667640635'},
'label': 'urn:c2pa:35adfcd7-6ebe-463f-9829-310afd9cefcb'}},
'validation_results': {'activeManifest': {'success': [{'code': 'claimSignature.insideValidity',
'url': 'self#jumbf=/c2pa/urn:c2pa:35adfcd7-6ebe-463f-9829-310afd9cefcb/c2pa.signature',
'explanation': 'claim signature valid'},
{'code': 'claimSignature.validated',
'url': 'self#jumbf=/c2pa/urn:c2pa:35adfcd7-6ebe-463f-9829-310afd9cefcb/c2pa.signature',
'explanation': 'claim signature valid'},
{'code': 'assertion.hashedURI.match',
'url': 'self#jumbf=/c2pa/urn:c2pa:35adfcd7-6ebe-463f-9829-310afd9cefcb/c2pa.assertions/c2pa.thumbnail.ingredient.jpeg',
'explanation': 'hashed uri matched: self#jumbf=c2pa.assertions/c2pa.thumbnail.ingredient.jpeg'},
{'code': 'assertion.hashedURI.match',
'url': 'self#jumbf=/c2pa/urn:c2pa:35adfcd7-6ebe-463f-9829-310afd9cefcb/c2pa.assertions/c2pa.ingredient.v3',
'explanation': 'hashed uri matched: self#jumbf=c2pa.assertions/c2pa.ingredient.v3'},
{'code': 'assertion.hashedURI.match',
'url': 'self#jumbf=/c2pa/urn:c2pa:35adfcd7-6ebe-463f-9829-310afd9cefcb/c2pa.assertions/c2pa.hash.data',
'explanation': 'hashed uri matched: self#jumbf=c2pa.assertions/c2pa.hash.data'},
{'code': 'assertion.dataHash.match',
'url': 'self#jumbf=/c2pa/urn:c2pa:35adfcd7-6ebe-463f-9829-310afd9cefcb/c2pa.assertions/c2pa.hash.data',
'explanation': 'data hash valid'}],
'informational': [],
'failure': []}},
'validation_state': 'Valid'}
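The JSON response shows:
- The active manifest and the earlier manifest it references as an ingredient
- The claim generator: ChatGPT
- The recorded actions: c2pa.created by GPT-4o with the digital source type trainedAlgorithmicMedia (AI-generated), then c2pa.converted by the OpenAI API
- The signature: algorithm Es256, issued by OpenAI
- The validation results: every check succeeded and the failure list is empty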
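To pull the provenance facts out of a response like the one above without reading the whole dictionary by eye, a small helper can walk every manifest and collect its actions. This is a minimal sketch: summarize_actions and is_ai_generated are hypothetical helper names, not part of fast-c2pa-python, and the sample dictionary is a trimmed copy of the output shown above.

```python
# IPTC digital source type used in the output above for AI-generated media.
AI_SOURCE_TYPE = (
    "http://cv.iptc.org/newscodes/digitalsourcetype/trainedAlgorithmicMedia"
)


def summarize_actions(metadata):
    """Collect every action recorded in any manifest of the response."""
    rows = []
    for manifest_id, manifest in metadata.get("manifests", {}).items():
        for assertion in manifest.get("assertions", []):
            # Action assertions are labelled c2pa.actions or c2pa.actions.v2.
            if not assertion.get("label", "").startswith("c2pa.actions"):
                continue
            for action in assertion.get("data", {}).get("actions", []):
                agent = action.get("softwareAgent", {})
                name = agent.get("name", "") if isinstance(agent, dict) else str(agent)
                rows.append({
                    "manifest": manifest_id,
                    "action": action.get("action", ""),
                    "agent": name,
                    "source_type": action.get("digitalSourceType", ""),
                })
    return rows


def is_ai_generated(metadata):
    """True if any recorded action declares the AI-generated source type."""
    return any(r["source_type"] == AI_SOURCE_TYPE for r in summarize_actions(metadata))


# Trimmed copy of the response shown above (only the fields used here).
sample = {
    "active_manifest": "urn:c2pa:35adfcd7-6ebe-463f-9829-310afd9cefcb",
    "manifests": {
        "urn:c2pa:f24e32e2-fbe2-4bf0-b31f-7b69ae159067": {
            "assertions": [{
                "label": "c2pa.actions.v2",
                "data": {"actions": [
                    {"action": "c2pa.created",
                     "softwareAgent": {"name": "GPT-4o"},
                     "digitalSourceType": AI_SOURCE_TYPE},
                    {"action": "c2pa.converted",
                     "softwareAgent": {"name": "OpenAI API"}},
                ]},
            }],
        },
        "urn:c2pa:35adfcd7-6ebe-463f-9829-310afd9cefcb": {"assertions": []},
    },
}

for row in summarize_actions(sample):
    print(row["action"], "by", row["agent"])
print("AI-generated:", is_ai_generated(sample))
```

The helper scans all manifests rather than only the active one because, in this response, the c2pa.created action lives in the ingredient's manifest while the active manifest has no assertions of its own.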
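C2PA uses multiple layers of verification, each visible in the status codes above:
- Claim signature: the signature is valid and was made within the signing certificate's validity period (claimSignature.validated, claimSignature.insideValidity)
- Assertion integrity: each assertion matches the hash recorded for it in the claim (assertion.hashedURI.match)
- Asset integrity: the image data matches the hash stored in the c2pa.hash.data assertion (assertion.dataHash.match)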
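One way to make these layers visible is to group the status codes in validation_results by their prefix. The sketch below assumes the response shape shown earlier; layer_report and the layer names are illustrative choices, not part of the C2PA specification or of fast-c2pa-python.

```python
# Map each status-code prefix to the verification layer it belongs to.
# The layer names are informal labels chosen for this example.
LAYERS = {
    "claimSignature.": "signature",
    "signingCredential.": "signer trust",
    "assertion.hashedURI.": "assertion integrity",
    "assertion.dataHash.": "asset integrity",
}


def layer_report(validation_results):
    """Count successes and failures per verification layer."""
    active = validation_results.get("activeManifest", {})
    report = {}
    for outcome in ("success", "failure"):
        for entry in active.get(outcome, []):
            code = entry.get("code", "")
            layer = next(
                (name for prefix, name in LAYERS.items() if code.startswith(prefix)),
                "other",
            )
            counts = report.setdefault(layer, {"success": 0, "failure": 0})
            counts[outcome] += 1
    return report


# Status codes copied from the valid image's response (URLs omitted).
sample_results = {
    "activeManifest": {
        "success": [
            {"code": "claimSignature.insideValidity"},
            {"code": "claimSignature.validated"},
            {"code": "assertion.hashedURI.match"},
            {"code": "assertion.hashedURI.match"},
            {"code": "assertion.hashedURI.match"},
            {"code": "assertion.dataHash.match"},
        ],
        "informational": [],
        "failure": [],
    }
}

for layer, counts in layer_report(sample_results).items():
    print(f"{layer}: {counts['success']} passed, {counts['failure']} failed")
```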
Let's test C2PA's tampering detection by modifying an image.
Below are two versions of the same image: the original and a grayscale copy.
We use fast-c2pa-python to convert the image while keeping its C2PA data intact. This allows us to demonstrate how C2PA detects even simple pixel modifications.
from fast_c2pa_python import convert_to_gray_keep_c2pa
# Convert image to grayscale while keeping C2PA data
input_image = "https://sightengine.com/assets/img/blog/c2pa/chatgpt_image.jpg"
output_image = "https://sightengine.com/assets/img/blog/c2pa/chatgpt_image_gray.jpg"
convert_to_gray_keep_c2pa(input_image, output_image, format="image/png")
# Display both images side by side
from IPython.display import HTML, display
display(HTML(f'''
<div style="display: flex; gap: 20px;">
<div>
<p>Original Image:</p>
<img src="{input_image}" width="300"/>
</div>
<div>
<p>Grayscale Image (with C2PA preserved):</p>
<img src="{output_image}" width="300"/>
</div>
</div>
'''))
Original Image:
Grayscale Image (with C2PA preserved):
# Verify C2PA data in the grayscale image
metadata = read_c2pa_from_file(output_image)
print("C2PA Validation State:", metadata["validation_state"])
C2PA Validation State: Invalid
metadata['validation_results']['activeManifest']['failure']
[{'code': 'assertion.dataHash.mismatch',
'url': 'self#jumbf=/c2pa/urn:c2pa:35adfcd7-6ebe-463f-9829-310afd9cefcb/c2pa.assertions/c2pa.hash.data',
'explanation': 'asset hash error, name: jumbf manifest, error: hash verification( Hashes do not match )'}]
As expected, C2PA validation fails: the grayscale conversion changed the pixel values, so the hash of the image data no longer matches the hash stored in the C2PA manifest.
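The mechanism behind this failure can be reproduced with nothing but hashlib. The sketch below is a toy model, not the C2PA algorithm itself: it hashes a few raw RGB bytes, whereas C2PA hashes the bytes of the encoded file, and to_gray is a hypothetical stand-in for the grayscale conversion.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Hex SHA-256 digest of a byte string."""
    return hashlib.sha256(data).hexdigest()


def to_gray(rgb: bytes) -> bytes:
    """Replace each RGB triple with its integer luma value, repeated three times."""
    out = bytearray()
    for i in range(0, len(rgb), 3):
        r, g, b = rgb[i:i + 3]
        y = (299 * r + 587 * g + 114 * b) // 1000
        out += bytes((y, y, y))
    return bytes(out)


# Four RGB pixels: red, green, blue, and one that is already gray.
pixels = bytes([255, 0, 0, 0, 255, 0, 0, 0, 255, 40, 40, 40])

# The signer records this hash at signing time.
stored_hash = sha256_hex(pixels)

# Any later change to the hashed bytes produces a different digest.
gray = to_gray(pixels)
print("hash still matches:", sha256_hex(gray) == stored_hash)  # False
```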
Now let's try a different type of tampering - modifying image metadata. We'll use exiftool to change the CreateDate field while keeping the image pixels intact. This tests if C2PA can detect metadata-only modifications.
# First create a copy of the image
!cp https://sightengine.com/assets/img/blog/c2pa/chatgpt_image.jpg https://sightengine.com/assets/img/blog/c2pa/chatgpt_image_createdate.jpg
# Then modify the CreateDate of the copy
!exiftool -CreateDate="2024:01:01 12:00:00" -overwrite_original https://sightengine.com/assets/img/blog/c2pa/chatgpt_image_createdate.jpg
1 image files updated
print("Original image CreateDate:")
!exiftool -CreateDate -s -s -s https://sightengine.com/assets/img/blog/c2pa/chatgpt_image.jpg
print("\n")
print("Modified image CreateDate:")
!exiftool -CreateDate -s -s -s https://sightengine.com/assets/img/blog/c2pa/chatgpt_image_createdate.jpg
Original image CreateDate:
Modified image CreateDate:
2024:01:01 12:00:00
The exiftool output shows that the original image has no CreateDate (the first result is empty) and that the copy now reports 2024:01:01 12:00:00. Let's see how C2PA validates this modified image.
metadata = read_c2pa_from_file("https://sightengine.com/assets/img/blog/c2pa/chatgpt_image_createdate.jpg")
print("C2PA Validation State:", metadata["validation_state"])
print("C2PA Validation Failures:", metadata['validation_results']['activeManifest']['failure'])
C2PA Validation State: Invalid
C2PA Validation Failures: [{'code': 'assertion.dataHash.mismatch', 'url': 'self#jumbf=/c2pa/urn:c2pa:35adfcd7-6ebe-463f-9829-310afd9cefcb/c2pa.assertions/c2pa.hash.data', 'explanation': 'asset hash error, name: jumbf manifest, error: hash verification( Hashes do not match )'}]
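As expected, C2PA detected the metadata tampering and marked the image as Invalid. Even though we only modified the CreateDate field and left the pixels untouched, the failure is again assertion.dataHash.mismatch: the data hash covers the bytes of the file (excluding the region that holds the C2PA manifest itself), so rewriting the EXIF block changes the hashed bytes.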
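To see why a metadata-only edit trips the same check, it helps to model a data hash with exclusion ranges: the file is hashed as a whole, except for the byte range that holds the manifest. The following is a simplified sketch under stated assumptions: the byte layout is invented for illustration and is not real JPEG structure, and hash_with_exclusions is a hypothetical helper, not a library function.

```python
import hashlib


def hash_with_exclusions(data: bytes, exclusions) -> str:
    """SHA-256 over data, skipping each (start, length) exclusion range."""
    digest = hashlib.sha256()
    position = 0
    for start, length in sorted(exclusions):
        digest.update(data[position:start])
        position = start + length
    digest.update(data[position:])
    return digest.hexdigest()


# Invented layout: EXIF block, then the manifest, then pixel data.
exif = b"EXIF:CreateDate=0000;"
manifest = b"C2PA-MANIFEST-STORE;"
pixels = b"PIXELDATA"
original = exif + manifest + pixels

# Only the manifest region is excluded from the hash.
exclusions = [(len(exif), len(manifest))]
stored = hash_with_exclusions(original, exclusions)

# A same-length EXIF edit changes bytes that are covered by the hash.
edited_exif = b"EXIF:CreateDate=2024;" + manifest + pixels
print("EXIF edit detected:", hash_with_exclusions(edited_exif, exclusions) != stored)

# Bytes inside the excluded range do not affect the data hash;
# in C2PA the manifest is protected by its signature instead.
edited_manifest = exif + b"C2PA-MANIFEST-XXXXX;" + pixels
print("manifest edit affects data hash:",
      hash_with_exclusions(edited_manifest, exclusions) != stored)
```

The toy edit keeps the EXIF block the same length so the exclusion offsets stay valid; an edit that shifts the manifest would move the excluded range onto different bytes and change the hash as well.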
We've seen how C2PA detects tampering, but there's another crucial aspect: how do we verify that content comes from trusted providers like Adobe or OpenAI?
By default, C2PA SDKs only verify content integrity, not the trustworthiness of sources. To enable trust verification, we need to configure a list of trusted certificates.
Let's test how C2PA validates an image that has valid signatures but isn't from our trusted sources list. We'll use a test image without enabling trust verification.
For this test, we'll use a sample image from the c2pa-rs repository that contains valid C2PA data but isn't signed by a known provider.
metadata = read_c2pa_from_file("https://sightengine.com/assets/img/blog/c2pa/C.jpg")
print("C2PA Validation State:", metadata["validation_state"])
print("C2PA Validation Failures:", metadata['validation_results']['activeManifest']['failure'])
C2PA Validation State: Valid
C2PA Validation Failures: []
The image passes all integrity checks with a Valid status. Looking at the manifest, we can see it's signed by "C2PA Test Signing Cert" - indicating it's a test certificate, not a real provider.
metadata['manifests']['contentauth:urn:uuid:b2b1f7fa-b119-4de1-9c0d-c97fbea3f2c3']['signature_info']
{'alg': 'Ps256',
'issuer': 'C2PA Test Signing Cert',
'cert_serial_number': '720724073027128164015125666832722375746636448153',
'time': '2024-08-06T21:53:37+00:00'}
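The lookup above hardcodes the manifest ID. A small helper can resolve the active manifest first, following the same pattern used earlier in this notebook, so the same code works for any image. active_signature_info is a hypothetical helper name, and the sample dictionary only reproduces the fields shown in the output above.

```python
def active_signature_info(metadata):
    """Return signature_info of the active manifest, or {} if it is missing."""
    active_id = metadata.get("active_manifest")
    manifest = metadata.get("manifests", {}).get(active_id, {})
    return manifest.get("signature_info", {})


# Minimal stand-in for the response of the test image shown above.
sample = {
    "active_manifest": "contentauth:urn:uuid:b2b1f7fa-b119-4de1-9c0d-c97fbea3f2c3",
    "manifests": {
        "contentauth:urn:uuid:b2b1f7fa-b119-4de1-9c0d-c97fbea3f2c3": {
            "signature_info": {
                "alg": "Ps256",
                "issuer": "C2PA Test Signing Cert",
                "cert_serial_number": "720724073027128164015125666832722375746636448153",
                "time": "2024-08-06T21:53:37+00:00",
            }
        }
    },
}

info = active_signature_info(sample)
print(info.get("issuer", "Unknown"))  # C2PA Test Signing Cert
```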
The Content Authenticity Initiative (CAI) provides a list of trusted certificates that we can configure in our C2PA settings (documentation).
This list includes certificates from major providers such as:
from fast_c2pa_python import read_c2pa_from_file, setup_trust_verification
# Setup trust verification
setup_trust_verification(
    "tests/tmp_cert/anchors.pem",  # Root certificates
    "tests/tmp_cert/allowed.pem",  # Allowed certificates
    "tests/tmp_cert/store.cfg"     # Trust configuration
)
# Read with trust list
metadata = read_c2pa_from_file("https://sightengine.com/assets/img/blog/c2pa/C.jpg")
print("C2PA Validation State:", metadata["validation_state"])
print("C2PA Validation Failures:", metadata['validation_results']['activeManifest']['failure'])
C2PA Validation State: Invalid
C2PA Validation Failures: [{'code': 'signingCredential.untrusted', 'url': 'self#jumbf=/c2pa/contentauth:urn:uuid:b2b1f7fa-b119-4de1-9c0d-c97fbea3f2c3', 'explanation': 'signing certificate untrusted'}]
Now the same image fails validation with status Invalid. The error code signingCredential.untrusted indicates that while the signature is valid, the issuer is not in our list of trusted providers.
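Since both tampering and an untrusted signer end in the same Invalid state, an application usually needs to look at the failure codes to tell them apart. The sketch below does this for the codes seen in this guide; classify and its return labels are hypothetical, and the rule that any code ending in .mismatch means tampering is an assumption that covers only the cases shown here.

```python
def classify(metadata):
    """Turn a C2PA response into a coarse verdict based on its failure codes."""
    results = metadata.get("validation_results", {})
    failures = results.get("activeManifest", {}).get("failure", [])
    codes = {entry.get("code", "") for entry in failures}

    # Hash mismatches mean the file changed after it was signed.
    if any(code.endswith(".mismatch") for code in codes):
        return "tampered"
    # The content is intact, but the signer is not on the trust list.
    if "signingCredential.untrusted" in codes:
        return "untrusted signer"
    if codes or metadata.get("validation_state") == "Invalid":
        return "invalid"
    return "passed"


# Failure lists copied from the outputs in this guide (URLs omitted).
tampered = {
    "validation_state": "Invalid",
    "validation_results": {"activeManifest": {
        "failure": [{"code": "assertion.dataHash.mismatch"}]}},
}
untrusted = {
    "validation_state": "Invalid",
    "validation_results": {"activeManifest": {
        "failure": [{"code": "signingCredential.untrusted"}]}},
}
valid = {
    "validation_state": "Valid",
    "validation_results": {"activeManifest": {"failure": []}},
}

for name, response in [("grayscale copy", tampered),
                       ("test certificate", untrusted),
                       ("original", valid)]:
    print(f"{name}: {classify(response)}")
```

Checking for tampering before trust is a deliberate ordering: if the bytes changed after signing, the identity of the signer no longer says anything about the file in hand.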
C2PA is a powerful tool for building trust in digital content, but its effectiveness faces two significant challenges. First is adoption - the system requires widespread implementation across digital services, content creators, and platforms to maintain the chain of trust.
The second challenge is metadata preservation. Currently, C2PA data is easily lost through common actions like taking screenshots or sharing on social media platforms. Most image processing operations strip this crucial metadata, breaking the verification chain.
These limitations highlight that while C2PA provides robust technical solutions for content authenticity, its success depends heavily on ecosystem-wide support and improved metadata resilience.
This tutorial uses several open source tools and references: