Custom Assertions

This guide covers how to create and report results for custom assertions in DataHub. Custom assertions are those that are not natively run or directly modeled by DataHub and are instead managed by a third-party framework or tool.

To create native assertions using the API (i.e. assertions that DataHub itself manages), please refer to the Assertions API.

This guide may be used as reference for partners seeking to integrate their own monitoring tools with DataHub.

Goal Of This Guide

In this guide, you will learn how to

  1. Create and update custom assertions via GraphQL and Python APIs
  2. Report results for custom assertions via GraphQL and Python APIs
  3. Retrieve results for custom assertions via GraphQL and Python APIs
  4. Delete custom assertions via GraphQL and Python APIs

Prerequisites

The actor making API calls must have the Edit Assertions and Edit Monitors privileges for the Tables being monitored.

Create And Update Custom Assertions

You may create custom assertions using the following APIs for a Dataset in DataHub.

To create a new assertion, use the upsertCustomAssertion GraphQL Mutation. This mutation allows you to both create and update a given assertion.

mutation upsertCustomAssertion {
  upsertCustomAssertion(
    urn: "urn:li:assertion:my-custom-assertion-id", # Optional: provide a custom id. If omitted, one will be generated for you.
    input: {
      entityUrn: "<urn of entity being monitored>",
      type: "My Custom Category", # This is how your assertion will appear categorized in DataHub.
      description: "The description of my external assertion for my dataset",
      platform: {
        urn: "urn:li:dataPlatform:great-expectations", # OR you can provide name: "My Custom Platform" if you do not have a URN for the platform.
      }
      fieldPath: "field_foo", # Optional: if you want to associate the assertion with a specific field.
      externalUrl: "https://my-monitoring-tool.com/result-for-this-assertion" # Optional: if you want to provide a link to the monitoring tool
      # Optional: If you want to provide a custom SQL query for the assertion. This will be rendered as a query in the UI.
      # logic: "SELECT * FROM X WHERE Y"
    }
  ) {
    urn
  }
}

You can either provide a unique id for the assertion, which will be used to form the assertion urn in the following format:

urn:li:assertion:<your-new-assertion-id>

or omit it, in which case a random urn will be created and returned for you. If you provide an id, it should be stable over time and unique for each assertion.
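
One way to keep a self-chosen id stable and unique is to derive it deterministically from values that identify the check. The following is a minimal sketch under that assumption, not a DataHub API: the helper name `make_assertion_urn` and the choice of `uuid5` over the dataset urn plus a check name are hypothetical.

```python
import uuid


def make_assertion_urn(entity_urn: str, check_name: str) -> str:
    """Derive a deterministic assertion urn from a dataset urn and a check name.

    Hypothetical helper: DataHub only requires that the id be stable and unique.
    """
    # uuid5 hashes its input, so the same pair always yields the same id.
    stable_id = uuid.uuid5(uuid.NAMESPACE_URL, f"{entity_urn}|{check_name}")
    return f"urn:li:assertion:{stable_id}"
```

Calling the helper twice with the same dataset urn and check name returns the same urn, so repeated upserts would update the existing assertion instead of creating a duplicate.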

If the call succeeds, the upsert API returns the unique identifier (URN) of the assertion:

{
  "data": {
    "upsertCustomAssertion": {
      "urn": "urn:li:assertion:your-new-assertion-id"
    }
  },
  "extensions": {}
}
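
The mutation can also be sent from Python as a plain GraphQL-over-HTTP request. The sketch below uses only the standard library and makes several assumptions that are not confirmed by this guide: the endpoint URL and bearer-token header are placeholders, the helper names are hypothetical, and the `String!` variable types are assumed to be compatible with the schema. It always supplies an explicit urn. A GraphQL response is keyed by the mutation field name, so the urn is read from `upsertCustomAssertion`.

```python
import json
import urllib.request

# Scalar variables only; the String! types are an assumption about the schema.
UPSERT_MUTATION = """
mutation upsertCustomAssertion(
  $urn: String!
  $entityUrn: String!
  $type: String!
  $description: String!
  $platformUrn: String!
) {
  upsertCustomAssertion(
    urn: $urn
    input: {
      entityUrn: $entityUrn
      type: $type
      description: $description
      platform: { urn: $platformUrn }
    }
  ) {
    urn
  }
}
"""


def build_upsert_request(endpoint: str, token: str, variables: dict) -> urllib.request.Request:
    """Build (but do not send) an HTTP POST carrying the upsert mutation."""
    body = json.dumps({"query": UPSERT_MUTATION, "variables": variables}).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",  # assumed auth scheme
        },
        method="POST",
    )


def extract_urn(response: dict) -> str:
    """Return the assertion urn from a decoded response, raising on GraphQL errors."""
    if response.get("errors"):
        raise RuntimeError(f"GraphQL errors: {response['errors']}")
    return response["data"]["upsertCustomAssertion"]["urn"]
```

To actually send the request, pass the returned object to `urllib.request.urlopen` and decode the body with `json.load` before handing it to `extract_urn`.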

Report Results For Custom Assertions

When an assertion is evaluated against a Dataset, or a new result is available, you can report the result to DataHub using the following APIs.

Once reported, these will appear in the evaluation history of the assertion and will be used to determine whether the assertion is displayed as passing or failing in the DataHub UI.

To report results for a custom assertion, use the reportAssertionResult GraphQL Mutation. Each call reports one result for an existing assertion.

mutation reportAssertionResult {
  reportAssertionResult(
    urn: "urn:li:assertion:<your-new-assertion-id>"
    result: {
      timestampMillis: 1620000000000, # Unix timestamp in millis. If not provided, the current time will be used.
      type: SUCCESS, # or FAILURE or ERROR or INIT
      properties: [
        {
          key: "my_custom_key",
          value: "my_custom_value"
        }
      ],
      externalUrl: "https://my-great-expectations.com/results/1234", # Optional: URL to the results in the external tool
      # Optional: If the type is ERROR, you can provide additional context. See full list of error types below.
      # error: {
      #   type: UNKNOWN_ERROR,
      #   message: "The assertion failed due to an unknown error"
      # }
    }
  )
}

The type field is used to communicate the latest health status of the assertion.

The properties field is used to provide additional key-value pair context that will be displayed alongside the result in DataHub's UI.
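
Monitoring tools usually hold this kind of context as a flat mapping, while `properties` expects a list of key-value pairs. A minimal sketch of the conversion follows; the helper name `to_properties` is hypothetical, and stringifying every value is an assumption based on the string values in the example above.

```python
def to_properties(context: dict) -> list:
    """Convert a flat dict into the list of {key, value} pairs used by `properties`."""
    # Values are stringified on the assumption that both key and value are strings.
    return [{"key": str(key), "value": str(value)} for key, value in context.items()]
```

For example, `to_properties({"row_count": 42})` yields a single pair whose value is the string `"42"`.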

The full list of supported error types is:

  • SOURCE_CONNECTION_ERROR
  • SOURCE_QUERY_FAILED
  • INSUFFICIENT_DATA
  • INVALID_PARAMETERS
  • INVALID_SOURCE_TYPE
  • UNSUPPORTED_PLATFORM
  • CUSTOM_SQL_ERROR
  • FIELD_ASSERTION_ERROR
  • UNKNOWN_ERROR

A successful call returns a response like the following:

{
  "data": {
    "reportAssertionResult": true
  },
  "extensions": {}
}

If reportAssertionResult is true, the result was successfully reported.
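
The pieces of a result (a millisecond timestamp, a result type, and an optional error) can be assembled and checked on the client before they are sent. The following is a sketch under stated assumptions: the helper names `build_result` and `was_reported` are hypothetical, the client-side validation is a convenience and not something DataHub requires, and the returned dict only mirrors the shape of the `result` argument shown above.

```python
import time
from typing import Optional

RESULT_TYPES = {"SUCCESS", "FAILURE", "ERROR", "INIT"}
ERROR_TYPES = {
    "SOURCE_CONNECTION_ERROR", "SOURCE_QUERY_FAILED", "INSUFFICIENT_DATA",
    "INVALID_PARAMETERS", "INVALID_SOURCE_TYPE", "UNSUPPORTED_PLATFORM",
    "CUSTOM_SQL_ERROR", "FIELD_ASSERTION_ERROR", "UNKNOWN_ERROR",
}


def build_result(
    result_type: str,
    error_type: Optional[str] = None,
    message: Optional[str] = None,
    timestamp_millis: Optional[int] = None,
) -> dict:
    """Assemble a dict shaped like the `result` argument of reportAssertionResult."""
    if result_type not in RESULT_TYPES:
        raise ValueError(f"unsupported result type: {result_type}")
    if timestamp_millis is None:
        # time.time() returns seconds; the API expects milliseconds.
        timestamp_millis = int(time.time() * 1000)
    result = {"timestampMillis": timestamp_millis, "type": result_type}
    if result_type == "ERROR" and error_type is not None:
        if error_type not in ERROR_TYPES:
            raise ValueError(f"unsupported error type: {error_type}")
        result["error"] = {"type": error_type, "message": message or ""}
    return result


def was_reported(response: dict) -> bool:
    """True only if the decoded response has no errors and reports success."""
    data = response.get("data") or {}
    return not response.get("errors") and data.get("reportAssertionResult") is True
```

Multiplying by 1000 matters here: a seconds-based value such as 1620000000 would be interpreted as a time in January 1970.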

Retrieve Results For Custom Assertions

After an assertion has been created and run, it will appear in the set of assertions associated with a given dataset urn. You can retrieve the results of these assertions using the following APIs.

Get Assertions for Dataset

To retrieve all the assertions for a table / dataset, you can use the following GraphQL Query.

query dataset {
  dataset(urn: "urn:li:dataset:(urn:li:dataPlatform:snowflake,purchases,PROD)") {
    assertions(start: 0, count: 1000) {
      start
      count
      total
      assertions {
        urn
        # Fetch the last run of each associated assertion.
        runEvents(status: COMPLETE, limit: 1) {
          total
          failed
          succeeded
          runEvents {
            timestampMillis
            status
            result {
              type
              nativeResults {
                key
                value
              }
            }
          }
        }
        info {
          type # Will be CUSTOM
          customType # Will be your custom type.
          description
          lastUpdated {
            time
            actor
          }
          customAssertion {
            entityUrn
            fieldPath
            externalUrl
            logic
          }
          source {
            type
            created {
              time
              actor
            }
          }
        }
      }
    }
  }
}
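
Once decoded, the response to this query can be reduced to the latest status of each assertion. The sketch below assumes the response mirrors the shape of the query above; the helper name `latest_result_types` is hypothetical.

```python
from typing import Dict, Optional


def latest_result_types(response: dict) -> Dict[str, Optional[str]]:
    """Map each assertion urn to the type of its most recent run.

    Assertions that have not run yet map to None.
    """
    assertions = response["data"]["dataset"]["assertions"]["assertions"]
    latest = {}
    for assertion in assertions:
        # The query asks for limit: 1, so at most one run event is expected.
        events = (assertion.get("runEvents") or {}).get("runEvents") or []
        result = events[0].get("result") if events else None
        latest[assertion["urn"]] = result["type"] if result else None
    return latest
```

Filtering the returned mapping for the value `"FAILURE"` gives the set of assertions currently failing for the dataset.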

Get Assertion Details

You can use the following GraphQL query to fetch the details for an assertion along with its evaluation history by URN.

query getAssertion {
  assertion(urn: "urn:li:assertion:my-custom-assertion-id") {
    urn
    # Fetch the last 10 runs for the assertion.
    runEvents(status: COMPLETE, limit: 10) {
      total
      failed
      succeeded
      runEvents {
        timestampMillis
        status
        result {
          type
          nativeResults {
            key
            value
          }
        }
      }
    }
    info {
      type # Will be CUSTOM
      customType # Will be your custom type.
      description
      lastUpdated {
        time
        actor
      }
      customAssertion {
        entityUrn
        fieldPath
        externalUrl
        logic
      }
      source {
        type
        created {
          time
          actor
        }
      }
    }
  }
}
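
The evaluation history returned by this query can be summarized on the client, for example as a pass rate and the time of the most recent failure. This is a sketch only: the helper name `summarize_history` is hypothetical, and it assumes that `succeeded` and `failed` count the runs returned by the query.

```python
def summarize_history(response: dict) -> dict:
    """Summarize the run history from a decoded getAssertion response."""
    run_events = response["data"]["assertion"]["runEvents"]
    # Collect the timestamps of every run whose result type is FAILURE.
    failures = [
        event["timestampMillis"]
        for event in run_events["runEvents"]
        if event.get("result") and event["result"]["type"] == "FAILURE"
    ]
    completed = run_events["succeeded"] + run_events["failed"]
    return {
        # None when there are no succeeded or failed runs to divide by.
        "pass_rate": run_events["succeeded"] / completed if completed else None,
        "last_failure_millis": max(failures) if failures else None,
    }
```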