๐Ÿ–‡๏ธ Managing the data model โ€‹

Each data model describes its:

  • Source of data: a database table or a SQL query
  • Fields: the columns present on this data model
  • Relations: relationships to other data models

For certain kinds of data models that almost every B2B SaaS company has, such as accounts, users and analytics events, you can also describe semantic properties.

These semantic properties later make it easier for anyone on your team to do complex things like cohort retention analysis, without having to worry about database tables or configure anything.

TIP

Add a comment at the top of your YAML file with the URL of the Supersimple configuration schema definition. Your editor's language server can then use the schema to provide auto-completion and linting.

Your editor might also require an extension, like YAML for VSCode.

yaml
# yaml-language-server: $schema=https://assets.supersimple.io/supersimple_configuration_schema.json
models:
  account:
    name: Account

    semantics:
      kind: Account
      properties:
        created_at: created_at

    properties:
      account_id:
        name: Account ID
        type: String
      name:
        name: Name
        type: String
      payment_plan:
        name: Payment Plan
        type: Enum
        enum_options:
          load: static
          options:
            - value: enterprise
              label: Enterprise
            - value: pro
              label: Pro
            - value: basic
              label: Basic
            - value: free
              label: Free
      created_at:
        name: Created At
        type: Date

    table: raw_account
    primary_key:
      - account_id
    relations:
      users:
        name: Users
        type: hasMany
        model_id: user
        join_strategy:
          join_key: account_id
      onboarding_response:
        name: Onboarding Response
        type: hasOne
        model_id: onboarding_response
        join_strategy:
          join_key: account_id

Data source

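The data source can be defined in three ways: a SQL query, a database table, or another model (see "Building models on top of other models" below). Only one may be used for a given model.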

SQL

yaml
models:
  somemodel:
    name: My model
    sql: select id, amount / 100 as amount_usd from x

Table

yaml
models:
  somemodel:
    name: My model
    table: schemaname.tablename # use the whole table as-is

Properties

Sets the properties (aka Fields) that a model has. Each field must be listed explicitly in order to be shown on the platform. Fields that are present in the data source but not in the properties list are not visible to users.

Each property has one of the following types:

  • String
  • Enum
  • Boolean
  • Number
  • Integer
  • Float
  • Date
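
The explicit-listing rule can be sketched as follows. This is a minimal illustration, not taken from a real project: the `invoice` model, the `raw_invoice` table and its columns are hypothetical, and only keys that already appear elsewhere on this page are used.

```yaml
models:
  invoice: # hypothetical model
    name: Invoice
    # Hypothetical table with the columns:
    # invoice_id, amount, is_paid, internal_notes
    table: raw_invoice
    primary_key:
      - invoice_id
    properties:
      invoice_id:
        name: Invoice ID
        type: String
      amount:
        name: Amount
        type: Number
      is_paid:
        name: Is Paid
        type: Boolean
      # internal_notes exists in the table but is not listed here,
      # so it is not visible to users on the platform.
```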

You can also specify a display format for properties of a certain type:

  • percentage
    • Float: 0.01 is displayed as 1.00%
  • eur
    • Number, Float, Integer: e.g. €1,000.00 according to your device's locale settings
  • usd
    • Number, Float, Integer: e.g. $1,000.00 according to your device's locale settings
  • gbp
    • Number, Float, Integer: e.g. £1,000.00 according to your device's locale settings
  • date
    • Date: e.g. Oct 28th 2024 according to your device's locale settings
  • time
    • Date: e.g. 16:08 according to your device's locale settings
  • iso
    • Date: e.g. 2021-01-01T12:00:00Z
  • raw
    • All types: removes all formatting (e.g. numbers are by default formatted according to your device's locale settings). This is automatically applied to primary keys and join keys.
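
A possible way to attach these formats to properties is sketched below. Note the assumption: this page does not show the name of the key that sets the display format, so `format` is a guess that should be confirmed against the configuration schema (your editor's auto-completion will show the actual key). The model, table and property names are hypothetical as well.

```yaml
models:
  invoice: # hypothetical model
    name: Invoice
    table: raw_invoice # hypothetical table
    properties:
      amount:
        name: Amount
        type: Number
        format: usd # `format` is an assumed key name; shown like $1,000.00
      discount_rate:
        name: Discount Rate
        type: Float
        format: percentage # 0.01 is displayed as 1.00%
      issued_at:
        name: Issued At
        type: Date
        format: iso # e.g. 2021-01-01T12:00:00Z
```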

Relations

Relations describe how different data models are linked together. Relations:

  • Encapsulate the semantic meaning of the relationships: two data models might have several relationships between each other, with different meanings (e.g. a Person might have two relations to other Persons, friends and enemies)
  • Centrally define the SQL join logic

Relations are, by default, unidirectional. They are defined on the "base model", which is the data model from which you can use them. For example, for a User's Car, User would be the base model, and Car would be the related model.
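
As a rough sketch of two relations with different meanings between the same pair of models: the relation keys `owner` and `billing_contact` and the two join columns on the account table are hypothetical, chosen only for illustration.

```yaml
models:
  account:
    name: Account
    table: raw_account
    # properties omitted for brevity
    relations:
      owner: # hypothetical relation
        name: Owner
        type: hasOne
        model_id: user
        join_strategy:
          join_key_on_base: owner_user_id # hypothetical column on the account table
          join_key_on_related: user_id
      billing_contact: # hypothetical relation
        name: Billing Contact
        type: hasOne
        model_id: user
        join_strategy:
          join_key_on_base: billing_contact_user_id # hypothetical column
          join_key_on_related: user_id
```

Both relations point at the same `user` model but carry different meanings. Because they are defined on `account`, they can be used from `account` only; using them from `user` would require separate relations defined on that model.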

hasMany

Each row in the base model has zero or more (up to infinity) matches in the related model (e.g. Company->Employee)

yaml
models:
  ...
  company:
    ...
    relations:
      employees:
        name: Employees
        type: hasMany
        model_id: employee
        join_strategy:
          join_key_on_base: company_id
          join_key_on_related: company_id

hasOne

Each row in the base model has exactly one match in the related model, or it has none at all (e.g. Employee->Employer)

yaml
models:
  ...
  employee:
    ...
    relations:
      employer:
        name: Employer
        type: hasOne
        model_id: company
        join_strategy:
          join_key_on_base: company_id
          join_key_on_related: company_id

manyToMany

Functions just like hasMany; each row in the base model has zero or more matches in the related model (e.g. User->Team where every user can be in multiple teams and every team can have multiple users)

yaml
models:
  ...
  team:
    ...
    relations:
      users:
        name: Users
        type: manyToMany
        model_id: user
        join_strategy:
          through:
            model_id: supersimple_user_in_team
            join_key_to_base: team_id
            join_key_to_related: user_id
          join_key_on_base: team_id
          join_key_on_related: user_id
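
Since relations are unidirectional, the relation above can only be used from `team`. A sketch of the inverse direction follows. It assumes the same `supersimple_user_in_team` through model and simply mirrors the keys of the example above; the relation key `teams` is hypothetical.

```yaml
models:
  user:
    name: User
    relations:
      teams: # hypothetical inverse of the `users` relation on team
        name: Teams
        type: manyToMany
        model_id: team
        join_strategy:
          through:
            model_id: supersimple_user_in_team
            join_key_to_base: user_id # through-model column pointing at the base (user)
            join_key_to_related: team_id # through-model column pointing at the related (team)
          join_key_on_base: user_id
          join_key_on_related: team_id
```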

hasOneThrough

Functions just like hasOne; the underlying database has an intermediary table (e.g. a Person's Grandfather is defined through Person->Parent->Parent)

yaml
models:
  ...
  person:
    ...
    relations:
      father: # This relation is used in the definition of the next relation
        name: Father
        type: hasOne
        model_id: person
        join_strategy:
          join_key_on_base: father_id
          join_key_on_related: person_id
      grandfather:
        name: Grandfather
        type: hasOneThrough
        model_id: person
        join_strategy:
          steps:
            - relation_key: father # This uses the relation defined above
            - relation_key: father

TIP

Note that you only need to define one level of relations between data models. It's always possible to later dynamically traverse through your entire data graph, e.g. going from accounts to their users, to the users' analytics events.

Metrics

Metrics allow you to reuse calculation logic in a flexible way. Metrics correspond to a single base model, and can be used from anywhere that has access to that data model. A metric can also be broken down (grouped) by any of that data model's Fields.

In your models YAML file, you can define metrics as follows:

yaml
metrics:
  transaction_gmv:
    name: GMV
    model_id: transaction
    aggregation:
      type: sum
      key: amount

The Metric can then be used as described under summarization options.

Metrics with more complex logic

Oftentimes, your Metrics will require more complex logic in addition to a single aggregation, such as filtering or even creating helper columns. For this, you can use any combination of our exploration steps (described in YAML) to define the logic of your Metric.

For example, you can define a Metric that only considers the GMV of Enterprise accounts:

yaml
metrics:
  transaction_enterprise_gmv:
    name: Enterprise GMV
    model_id: transaction
    operations:
      - operation: addRelatedColumn
        parameters:
          relation:
            key: account
          columns:
            - key: payment_plan
              property: # This is the property we will be able to access in the next step
                key: account_payment_plan # This is the key we will use to access the property
                name: Account Payment Plan # Human-readable name
      - operation: filter
        parameters:
          key: account_payment_plan
          operator: ==
          value: enterprise
    aggregation:
      type: sum
      key: amount

As expected, this Metric is 0 for all payment plans other than Enterprise.

See the Using operations section below for more information on how to structure the operations used here.
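
A simpler variant, sketched here, filters on one of the base model's own fields, so no related column is needed. The metric key and name are hypothetical; the `filter` parameters follow the shape used elsewhere on this page.

```yaml
metrics:
  transaction_large_gmv: # hypothetical metric
    name: Large Transaction GMV
    model_id: transaction
    operations:
      - operation: filter # keep only transactions above 1000
        parameters:
          key: amount
          operator: ">"
          value: 1000
    aggregation:
      type: sum
      key: amount
```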

Using Operations

You can also use the "no-code exploration steps" that you'd normally use in the UI to create new data models. While our YAML schema autocompletion will assist you with the syntax, you can also use our UI to get the YAML for any exploration block:

Click "Show YAML" in the dropdown menu of any block

Applying operations to raw data

Here, we use the account database table as a base, define a relation called users and use that relation itself to add the calculated column Number of users right into the model:

yaml
models:
  account:
    name: Account

    table: account
    operations:
      - operation: relationAggregate
        parameters:
          relation:
            key: users
          aggregations:
            - type: count
              property:
                name: Number of users
                key: number_of_users

    properties:
      # List any properties here, except for ones
      # created by the above "operations". In this case:
      # `number_of_users` will be automatically
      # recognized as a property, and does not need to be
      # listed here manually.
      account_id:
        name: Account ID
        type: String
    relations:
      users:
        # This is the relation that we are using above
        name: Users
        # ... rest of the relation definition

As a result, the data model in the UI has the Number of users property as its rightmost column, created by the operations in the YAML file, without showing any steps/operations in the sidebar.

TIP

The "operations" used are not visible in the sidebar as steps.

Building models on top of other models

You can also define models "on top of" other models, instead of raw database tables or SQL queries. Here, we are building on top of the already-defined user model:

yaml
models:
  users_with_many_large_transactions:
    name: Users with many large transactions

    base_model: user # This is the model we are adding operations to
    operations:
      - operation: relationAggregate # We first create a column
        parameters:
          relation:
            key: transactions
          aggregations:
            - type: count
              property:
                name: Number of large transactions
                key: number_of_large_transactions
          filters:
            - parameters:
                key: amount
                operator: ">"
                value: 1000

      - operation: filter # And then use that column to filter
        parameters:
          key: number_of_large_transactions
          operator: ">"
          value: 100

    # All properties from the `user` base model are auto-included
    # so there's no need to doubly define them here.
    # All properties created by Operations are also auto-included.
    properties: {}

The resulting data model would then have the Number of large transactions property and include only the filtered-down rows, without showing any steps/operations in the sidebar.

TIP

You can use this to, for example, create more specialized versions of existing data models, or to apply filtering that you want to always be present (preventing users from forgetting to apply it).
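
For instance, an always-filtered version of the `account` model from the top of this page might look like the sketch below. The model key `enterprise_account` is hypothetical; the `payment_plan` property and its `enterprise` value come from the earlier example.

```yaml
models:
  enterprise_account: # hypothetical model
    name: Enterprise Account

    base_model: account # build on the existing account model
    operations:
      - operation: filter # this filter is always applied
        parameters:
          key: payment_plan
          operator: "=="
          value: enterprise

    # Properties from the `account` base model are auto-included.
    properties: {}
```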

Difference between no-code steps and operations

You'll notice slight differences in naming between the YAML format and the step names in the UI (for example: New column corresponds to multiple different operations in order to provide a clearer API).

Because of this, it's always easiest to use the UI to generate the relevant YAML wherever possible.