The dbt Labs AI Development Principles

We believe that Generative AI, Large Language Models, Machine Learning technologies, and other automation and computer reasoning tools (for simplicity, let’s just call these AI Technologies) are capable of significantly enhancing the experience of dbt Cloud users. Our product and engineering teams are working with a number of these AI Technologies today, looking for ways to make data transformation more accessible and efficient. We know that we can only succeed in our mission if our users trust us to build and release features that use these AI Technologies safely, and we see transparency as our North Star. In that vein, we want to share with you our intentions and principles for leveraging AI Technologies in our platform, including your ability to control the use of these technologies for your account. For clarity, our platform means dbt Cloud and not customer support, documentation search, or internal tooling that we may use ourselves.

A Note About “Data”

Our Terms of Use define two categories of “data,” and understanding them is important for knowing how we intend to leverage AI Technologies. The first category is “Client Data,” which includes materials, information, or content that users share or input into dbt Cloud. This category can include your queries, column names, table names, lineage, data previews, and any table data. We treat this data as confidential and we protect it as if it were our own sensitive data. We will only use Client Data to provide services to the specific client that uploaded that data.

The second category is “Platform Data,” which includes data generated by dbt Cloud in the service of processing user requests. Examples include SQL commands that don’t contain any specific data from a customer table, as well as performance metrics, system logs, and anonymized usage statistics. This data is not confidential between us and may be used for purposes of improving our platform and the user experience for all dbt Cloud users.

When using dbt Cloud to run transformations, the data from your data warehouse remains in your data warehouse. We utilize AES-256 for all data encrypted at rest and TLS 1.2+ for data in transit. For more information about our general security practices and our practices specific to data processing, please see our Security page.

AI Partners and Engines

The first features we are releasing that leverage AI Technologies are powered by third-party large language model providers. Where we contract with a third-party provider for these services, we do not allow them to use any of our Client Data or Platform Data to train their models. In addition, we require that any data sent to a third-party provider, including prompts that we write and data that you supply, be retained in their systems only for a short period of time.

Model training

We may use your Client Data to train our AI models only to improve your own experience with the platform, not for the benefit of other users. We will not allow your Client Data to be used to train third-party provider models. We do not currently use Client Data for any internal model training. If we introduce such training in the future, we will provide a disclosure to you through the dbt Cloud application or direct communications before enabling it, and we will allow you to opt out of that training.

We may use Platform Data to train internal AI models to improve the platform generally. This data will be de-identified and aggregated before being used for model training. We see this data as providing opportunities to improve our overall platform performance. We do not currently use Platform Data for any internal model training, but should we introduce such training in the future, we will inform you. You will not be able to opt out of this type of training.

AI Account Settings

Account administrators can control the use of AI Technologies in our platform, both for account users and for model training, via toggles on the account settings screen. If an account chooses to opt out of AI features, the UI elements for those features will still be visible, but interacting with them will show the user a message that the feature is not available in their account and that they should ask an account administrator to enable it. Administrators can change the AI account settings at any time. These settings control:

  • Access to AI features
    • AI features are set to “opt-in” by default for all accounts in all plans unless we have a prior agreement with you to disable these features in your account.
    • Enabling the AI features is an all-or-nothing option. That is, you cannot opt out of certain AI features and opt in to others.
  • Model training
    • We will never allow third-party providers to train their models using either your Client Data or our Platform Data. Because we do not currently train any internal models, there are no settings for internal model training yet. If we do release a feature that uses Client Data to train models, account administrators will have an option to opt out of that training.

Enterprise customer agreements that are governed by MSAs are more complex and may include specific requirements related to these features. Some customers may have stringent regulatory or compliance requirements related to their data and would prefer to opt out. If your agreements with us specify that your account should be opted out of AI features, we will administratively opt the account out. An account administrator can opt back in at any time.
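
The settings behavior described in this section can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the names AccountAISettings, default_settings, use_ai_feature, and UNAVAILABLE_MESSAGE are hypothetical and are not part of any dbt Cloud API, and the message wording is invented. The sketch models a single all-or-nothing toggle that is on by default, is off by default when an agreement requires it, and can be changed by an administrator at any time.

```python
from dataclasses import dataclass

# Hypothetical message text; the actual wording shown in dbt Cloud is not specified here.
UNAVAILABLE_MESSAGE = (
    "This feature is not available in your account. "
    "Ask an account administrator to enable it."
)


@dataclass
class AccountAISettings:
    """Illustrative model of the account-level AI toggle (not a real dbt Cloud schema)."""

    # One switch covers every AI feature: access is all-or-nothing.
    ai_features_enabled: bool = True


def default_settings(contract_requires_opt_out: bool = False) -> AccountAISettings:
    """Accounts start opted in unless an agreement says they must be opted out."""
    return AccountAISettings(ai_features_enabled=not contract_requires_opt_out)


def use_ai_feature(settings: AccountAISettings, feature_name: str) -> str:
    """The UI element stays visible; a disabled account only sees a message."""
    if not settings.ai_features_enabled:
        return UNAVAILABLE_MESSAGE
    return f"{feature_name}: available"


if __name__ == "__main__":
    account = default_settings(contract_requires_opt_out=True)
    print(use_ai_feature(account, "example-feature"))  # shows the unavailable message
    account.ai_features_enabled = True                 # an administrator opts back in
    print(use_ai_feature(account, "example-feature"))
```

The single boolean reflects the all-or-nothing rule: there is no per-feature switch to model. No training toggle appears in the sketch because, as stated above, no internal model training options exist yet.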

Future Features

We may release new AI features that are sold as add-ons to the base dbt Cloud platform. If we do this, the principles above will still apply: your Client Data will be treated as confidential and used only to train models for your specific use. We will be clear and transparent about the behavior of these features as they are released.