How Climatiq handles data quality

While the emission factors provided by Climatiq are calculated by government agencies and top climate scientists, our science and data team still detects the occasional error. In other words, some of the emission factors these bodies publish are inaccurate or otherwise problematic.

When Climatiq or its users (that's you!) notice these mistakes, we take one of two actions:

  • If the emission factor is wrong enough to be unusable, we often decide not to include it at all.
  • If we judge that the emission factor is still usable and not misleading, we include it in the API but describe any issues in the data_quality_flags attribute, which a variety of endpoints return as a list of data quality flags. Details of the issue are also included in the description field of the emission factor, which you can retrieve when searching emission factors.

We decide between these two approaches by weighing the potential impact of applying an erroneous factor, the importance of adhering to the source data, and how best to inform users of the issue.

An example of a response after performing an estimate with an emission factor that has data quality issues could look like this:

{
  "co2e": 1.1082,
  "co2e_unit": "kg",
  "co2e_calculation_method": "ar5",
  "co2e_calculation_origin": "source",
  "emission_factor": {
    "name": "Electricity supplied from grid",
    "activity_id": "electricity-supply_grid-source_supplier_mix",
    "id": "8dcd59f9-8193-4b0c-93b2-0115949b9629",
    "access_type": "public",
    "source": "GHG Protocol",
    "source_dataset": "GHG Emissions Calculation Tool",
    "year": 2021,
    "region": "CN-NE",
    "category": "Electricity",
    "source_lca_activity": "electricity_generation",
    // This list is not empty! That means there are data quality issues with this emission factor
    "data_quality_flags": ["notable_methodological_variance"]
  },
  "constituent_gases": {
    "co2e_total": 1.1082,
    "co2e_other": null,
    "co2": 1.1082,
    "ch4": 0.0,
    "n2o": 0.0
  },
  "activity_data": {
    "activity_value": 1.0,
    "activity_unit": "kWh"
  },
  "audit_trail": "selector"
}

A non-empty data_quality_flags attribute signals that there is something you should be mindful of when using this emission factor. If data_quality_flags is empty, Climatiq has not detected any issues with the emission factor.
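A client can inspect this list before relying on a result. The following is a minimal Python sketch, assuming the response body has already been parsed into a dict shaped like the example response shown earlier. The helper name quality_flags is hypothetical and not part of any Climatiq client library.

```python
from typing import Any


def quality_flags(estimate: dict[str, Any]) -> list[str]:
    """Return the data quality flags of the emission factor used in an estimate.

    Hypothetical helper: assumes `estimate` is the parsed JSON body of an
    estimate response shaped like the example response in this guide.
    """
    factor = estimate.get("emission_factor") or {}
    return list(factor.get("data_quality_flags") or [])


# Trimmed copy of the example response from this guide.
example = {
    "co2e": 1.1082,
    "co2e_unit": "kg",
    "emission_factor": {
        "activity_id": "electricity-supply_grid-source_supplier_mix",
        "data_quality_flags": ["notable_methodological_variance"],
    },
}

flags = quality_flags(example)
if flags:
    print(f"Use with care, flagged: {', '.join(flags)}")
else:
    print("No data quality issues detected")
```

Treating a missing attribute the same as an empty list is a choice made for this sketch only; a stricter client might prefer to raise an error when the attribute is absent.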

Querying with Data Quality Flags

You can specify which data quality flags are acceptable for your use case via the allowed_data_quality_flags parameter, which most endpoints accept as a list of data quality flags. Any emission factor that carries a data quality flag not in the list you provide will not be used.

For example, if you provide "allowed_data_quality_flags": ["erroneous_calculation", "partial_factor"] to the /estimate endpoint, an emission factor flagged only with partial_factor could be chosen, because partial_factor is in the allowed list. However, an emission factor flagged with both notable_methodological_variance and erroneous_calculation would not be, because notable_methodological_variance is not in the list of allowed data quality flags.
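The rule amounts to a subset check: an emission factor is eligible only if every one of its flags appears in the allowed list. Here is a minimal Python sketch of that rule, for illustration only. The name is_allowed is hypothetical, and the code mirrors the behaviour described here rather than Climatiq's actual implementation.

```python
def is_allowed(factor_flags: list[str], allowed_flags: list[str]) -> bool:
    """Return True if every flag on the factor is in the allowed list."""
    return set(factor_flags) <= set(allowed_flags)


allowed = ["erroneous_calculation", "partial_factor"]

# Only partial_factor, which is allowed.
print(is_allowed(["partial_factor"], allowed))  # True

# notable_methodological_variance is not in the allowed list.
print(is_allowed(["notable_methodological_variance", "erroneous_calculation"], allowed))  # False

# A factor without flags has nothing outside the allowed list.
print(is_allowed([], allowed))  # True
```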

Data Quality Flags

The table below shows the different data quality flags, and whether endpoints allow their use by default or not.

| Flag | Description | Allowed by default |
| --- | --- | --- |
| notable_methodological_variance | We have detected potential issues in the methodology (the method used to calculate an emission factor) behind an emission factor. This could be because there is a generally recognized methodology for similar emission factors, and this one is different, or because the source is unclear about which methodology they use. | |
| partial_factor | The co2e value of this factor does not take into account all gases emitted from an activity; for example it may only take into account CO2 emissions from an activity and not other greenhouse gases. It has been included because the factor or source is considered important enough to make available. See the description of the factor for more information, and refer to the source for more details. | |
| self_reported | This data quality flag indicates that the entity cited in the source field is both the reporter of the emission factor and the producer of the emissions from which it was derived. It has not necessarily been vetted or produced by an acknowledged independent organization. | |
| suspicious_homogeneity | This data quality flag is given when multiple factors have a suspicious level of homogeneity (meaning there are identical CO2e values over multiple different activities). This implies that the underlying model did not have an appropriate level of granularity to create these factors and that they should be used with caution. | |
| erroneous_calculation | We have detected an error in how the source calculated the emission factor, however we do not deem that it will skew the results in a major way, and have judged that adherence to the source data makes it important enough to include. | |

If you need more tools to work with data quality, or you've found an emission factor that seems off, we'd love to hear from you.