Licensed to be used in conjunction with basebox, only.
Schema Definition Guide
Overview
...basebox provides a lightweight but powerful annotation language to extend the functionality of a GraphQL schema to both generate a database and to query and mutation that database without having to build your own backend service.
The basebox Compiler (bbc) exposes the power of Postgres (PostgreSQL) via GraphQL. It does this by taking a novel approach of converting GraphQL to Postgres' Structured Query Language (SQL). In other words, you define the data structures of your application in a GraphQL schema file (with a sprinkling of additional annotations) and the Compiler creates the corresponding data model in Postgres from the schema. The Compiler also creates a few files that then goes into running our dbproxy server, this creates a GraphQL interface to the newly created database.
Sample Schema
Here's a short sample GraphQL schema with annotations:
type List {
id: ID!
title: String!
tasks: [Task]
}
type Task {
id: ID!
title: String!
description: String,
completed: Boolean!
list: List!
}
type Query {
getList(id: ID!): List
@bb_resolver(
_type: SELECT,
_object: List,
_filter: { id: { _eq: "$id" } })
}
type Mutation {
createList(title: String!): List
@bb_resolver(
_type: INSERT,
_object: List,
_fields: { title: "$title" })
createTask(title: String!, description: String, completed: Boolean!, listId: ID!): Task
@bb_resolver(
_type: INSERT,
_object: Task,
_fields: {
title: "$title",
description: "$description",
completed: "$completed",
list: { id: "$listId" }
})
updateTask(id: ID!, title: String!, description: String, completed: Boolean!, listId: ID!): Task
@bb_resolver(
_type: UPDATE,
_object: Task,
_filter: { id: { _eq: "$id" } }
_fields: {
title: "$title",
description: "$description",
completed: "$completed",
list: { id: "$listId" }
})
deleteTask(id: ID!): Task
@bb_resolver(
_type: DELETE,
_object: Task,
_filter: { id: { _eq: "$id" } })
}
Type System / Data Definition Language
The GraphQL Type system describes the capabilities of a GraphQL service and allows us to define a GraphQL schema that we then translate into SQL's Data Definition Language. GraphQL Object Types map directly to SQL Tables, with GraphQL fields in those objects becoming columns in the tables.
Some Definitions
GraphQL Schema
From the GraphQL specification:
A GraphQL service’s collective type system capabilities are referred to as that service’s “schema”. A schema is defined in terms of the types and directives it supports as well as the root operation types for each kind of operation: query, mutation, and subscription; this determines the place in the type system where those operations begin.
The GraphQL schema specifies the types available to caller's of the GraphQL service as well as the operations available to be called to either return data or perform some function. While the schema typically describes the service, fulfillment of the service still needs to be written by some other means (GraphQL does not specify how that should be developed) in a backend service.
At a very basic level, basebox provides a lightweight but powerful annotation language to extend the functionality of a GraphQL schema to both generate a database and to query and mutation that database without having to build your own backend service. The backend service is effectively generated from the GraphQL schema using the basebox compiler.
Schema Directives
GraphQL directives are a powerful aspect of GraphQL that provides a means to describe alternative behavior within a GraphQL document. basebox creates functionality within a GraphQL schema by extending the standard GraphQL directives with our own set of compiler directives. Note that, in order for directive names not to conflict with potential future directive names being added into the GraphQL specification, we start the basebox directive names with bb_
. This is in line with the GraphQL specification.
Defining Objects / Tables
To start getting into how this works, let us look at an example of a GraphQL schema:
type List {
id: ID!
title: String!
tasks: [Task]
user: User!
}
type Task {
id: ID!
title: String!
description: String,
completed: Boolean!
user: User!
list: List!
}
type User {
username: String! @bb_primaryKey
name: String
tasks: [Task]
lists: [List]
}
This represents the GraphQL objects of a very simple Todo list application. If we focus firstly on the List type, this will generate the following table:
CREATE TABLE "List" (
"id" UUID DEFAULT gen_random_uuid() NOT NULL,
"title" VARCHAR NOT NULL,
"user_username" VARCHAR NOT NULL
);
ALTER TABLE "List" ADD PRIMARY KEY ("id");
Notice firstly that List is in double quotes; this is because GraphQL types are case-sensitive (List and list are different names) while Postgres tables by default are not. They become case-sensitive when you put them in double quotes. This thus allows us to translate GraphQL names directly to SQL names, making it easy to see which GraphQL objects correspond to which SQL tables.
By and large we use this as a foundering principle, that the Compiler translate names as similarly as possible. You will notice the same with the GraphQL fields to SQL columns.
Primary keys
Next, let's look at the first field, the id, which is declared as a field of type ID in the GraphQL schema. The GraphQL ID is a scalar type that represents a unique identifier. The Compiler recognizes ID as potential for a primary key, and follows some basic rules for this.
If there is only one ID field in an object, the Compiler will recognize this as the primary key of the table. You will notice that it creates the id field as a UUID and default this to an automatically generated UUID (Notice also that it has also converted the ! to NOT NULL).
Why no auto-incremented numeric IDs?
This may seem weird for some users at first sight. The reason we do not use numeric primary keys is security: numeric IDs are too easy to guess. UUIDs in contrast are so random that they are virtually impossible to guess.
@bb_primaryKey
If there are no ID fields or more than one ID fields in a table, the Compiler will actually issue an error as it does not know what the primary key is for the table (note that we want a primary key to be able to identify a List later on).
This can be seen in the User object, where username is declared as a String. We have thus created the @bb_primaryKey
directive that lets you specify to the Compiler what the primary key is. The User object will be generated as this:
CREATE TABLE "User" (
"username" VARCHAR NOT NULL,
"name" VARCHAR
);
ALTER TABLE "User" ADD PRIMARY KEY ("username");
Foreign keys
The List object in Example 1 has a field defined as user: User!
, which is a reference to a single user in the User object (referring to the user that owns the list in real world terms). This translates to a foreign key or join in SQL and this is an example of a one-to-many join. Let's look at what happens in SQL:
CREATE TABLE "List" (
"id" UUID DEFAULT gen_random_uuid() NOT NULL,
"title" VARCHAR NOT NULL,
"user_username" VARCHAR NOT NULL
);
CREATE TABLE "User" (
"username" VARCHAR NOT NULL,
"name" VARCHAR
);
ALTER TABLE "List" ADD CONSTRAINT fk_list_1 FOREIGN KEY ("user_username") REFERENCES "User" ("username");
The user field has become the user_username column. The Compiler follows the link from List to User and sees that the identifying key (i.e. the primary key) for a User is username. It then creates a column in the List table as user_username which is a concatenation of the field name user and the primary key name of the table User.
In SQL terms this then says A list has one user and one user can have many lists
. Note that the corresponding lists: [List]
field in User has disappeared in SQL, this is unnecessary in SQL, and we will be able to join between the tables using just the user_username column (more on that when we look at Queries).
Notice that the tasks: [Task]
field has also disappeared, this is due to the same reason.
At this point, let's look at the entire SQL script generated for the GraphQL schema.
CREATE TABLE "List" (
"id" UUID DEFAULT gen_random_uuid() NOT NULL,
"title" VARCHAR NOT NULL,
"user_username" VARCHAR NOT NULL
);
CREATE TABLE "Task" (
"id" UUID DEFAULT gen_random_uuid() NOT NULL,
"title" VARCHAR NOT NULL,
"description" VARCHAR,
"completed" BOOLEAN NOT NULL,
"user_username" VARCHAR NOT NULL,
"list_id" UUID NOT NULL
);
CREATE TABLE "User" (
"username" VARCHAR NOT NULL,
"name" VARCHAR
);
ALTER TABLE "List" ADD PRIMARY KEY ("id");
ALTER TABLE "Task" ADD PRIMARY KEY ("id");
ALTER TABLE "User" ADD PRIMARY KEY ("username");
ALTER TABLE "List" ADD CONSTRAINT fk_list_1 FOREIGN KEY ("user_username") REFERENCES "User" ("username");
ALTER TABLE "Task" ADD CONSTRAINT fk_task_2 FOREIGN KEY ("user_username") REFERENCES "User" ("username");
ALTER TABLE "Task" ADD CONSTRAINT fk_task_3 FOREIGN KEY ("list_id") REFERENCES "List" ("id");
Type Conversion
Looking at Example 5, besides a field type of ID become UUID, you'll see that the GraphQL String has become SQL VARCHAR and GraphQL Boolean has become SQL BOOLEAN. The Compiler converts from GraphQL scalar types to a corresponding SQL type.
Here's a table of GraphQL types and their corresponding SQL data types, based on the GraphQL specification of scalar types:
GraphQL | SQL |
---|---|
Int | INTEGER |
Float | FLOAT8 |
String | VARCHAR |
Boolean | BOOLEAN |
ID | UUID |
Custom Scalar Types
GraphQL grants us the ability to extend the scalar type system using the scalar
keyword.
For example, you could declare a specific GraphQL type for UUID like so
and include a @specifiedBy directive to include a URL that explains the specification behind the type, like so
This however does not provide any tangible information to an implementing system on how to handle the type or what that type could correspond to in a Postgres database.
Date Types
Take for example Date and DateTime, GraphQL does not as yet include types for this. We have created two custom types to support this, corresponding to the appropriate SQL data types.
GraphQL | SQL |
---|---|
Date | DATE |
DateTime | TIMESTAMP |
@bb_scalar
We have also created the @bb_scalar
directive to allow the creation and usage of custom scalars with type information.
Here is an example of how to use the Date and DateTime types we've created:
scalar DueDate @bb_scalar(_type: "Date")
scalar TimeStamp @bb_scalar(_type: "DateTime")
type Task {
id: ID!
title: String!
description: String,
completed: Boolean!
user: User!
list: List!
due_date: DueDate
created_on: TimeStamp!
}
The Compiler will create corresponding SQL as such:
for new domain types called "DueDate" and "TimeStamp" (note the double quotes for case-sensitivity) with the two columns:
Base64
We also support the Base64
custom type that stores binary data. This type is stored as a BYTEA
data type in Postgres.
Enumerated Types
GraphQL has an enumerated type (enum) that corresponds quite closely to a Postgres enum type. We also create the corresponding SQL enumerated type based on the GraphSQL declaration.
Defining Operations
There are three types of GraphQL operations, namely query, mutation and subscription, that a GraphQL service provides. The Compiler supports generation of SQL queries and mutations from GraphQL (subscriptions are not currently supported).
Resolvers
A resolver is a function that fulfills a GraphQL operation request by populating the required data. The Compiler allows for resolvers to be defined within the GraphQL schema itself.
@bb_resolver
directive
Resolvers are attached as @bb_resolver
directives to each GraphQL query and mutation operation. Let's extend the Todo application GraphQL definition from Example 1 with the following Query type:
type Query {
getUser(
username: String!
): User @bb_resolver(_type: select, _object: User, _filter: { username: { _eq: "$username" } })
}
The second argument we can see is the _object which refers to object being operated on, User in this case.
Lastly we see the _filter argument, this allows us to filter on the fields of an object. In this case its telling the Compiler that this operation will filter the User object on the username field, with the value equal to (that's the _eq part) \$username. The \$username refers to the username argument of the getUser operation, using the dollar sign allows us to refer to arguments of the operation in our resolver directive. The \$username does not have to be called username, it can be called whatever the operation argument name is called.
The _filter argument is specified in GraphQL object format, i.e. using {} to open and close sections.
When GraphQL is compiled, the Compiler will create a corresponding SQL statement for this operation and store a version of this in a _resolver.toml file that will be used by the dbproxy when GraphQL (operation) requests are received.
Comparison operators
Before we look at resolvers for Mutations, let's look at the filter section in more detail, starting with the _eq part of it. As mentioned, this is a comparison operator that corresponds to an equals to and can be used in the same way. We have created several more operators, described in the following table:
operator | description |
---|---|
_eq | equals to |
_neq | not equals to |
_gt | greater than |
_gte | greater than or equal to |
_lt | less than |
_lte | less than or equal to |
_in | in |
_nin | not in |
_like | like |
_nlike | not like |
The _like and _nlike operators correspond to SQL like and not like statements. These perform a string comparison on a field/column. You can make use of an % to specify a wildcard e.g. username: { _like: "B%" }
for all usernames beginning with the letter B.
Implicit AND
We might also want to get all tasks for a user that are not complete, this would require us to select records based on two fields. We can do this as follows:
type Query {
getUserTasks(
userName: String!
): Task @bb_resolver(_type: select, _object: Task, _filter: { user: { username: { _eq: "$userName" } }, completed: { _eq: false } })
}
This example searches the Task object/table for all records where the user has a username of \$user AND completed is equal to false. The Compiler does a logical AND between the two conditions, meaning both has to be true in order for a result to be returned.
Specifying an object in a filter
You might have noticed that this part user: { username: { _eq: "$userName" } }
specifies a field within the user (which is a User object). Because the Task object does not have a field called username but rather has a field called user, we need some way to refer to the username in the User object.
The filter criteria has to ultimately refer to a scalar field, and in our case, a scalar field that will allow us to reference the user which is being addressed. Because we have specified the primary key for the User object, the Compiler will expects a referencing username. Currently this only works for specifying the primary keys but in future we will allow for other fields (like name).
We will explore this in more detail once we look at Operation Handling.
Mutation Resolvers
The Compiler supports three mutations operations; insert, update and delete.
Insert Mutations
Let's look at insert first:
createTask(
title: String!,
description: String,
completed: Boolean!,
listId: ID!
userName: String!
): Task @bb_resolver(
_type: INSERT,
_object: Task,
_fields: {
title: "$title",
description: "$description",
completed: "$completed",
list: {
id: "$listId"
},
user: {
username: "$userName"
}
}
)
This operation creates a new task. The _type is INSERT and the _object is Task. We introduce the _fields argument, this specifies the fields that will be inserted when creating a new task. The fields include links to a List and User.
The _fields argument accepts an object with the name of the fields and a reference to the argument from the operation. This can also be a constant value, for example we could remove the completed argument from the operation and change the completed entry in the _fields resolver argument to read completed: false
(a Boolean in GraphQL does not need double quotes like a String does).
Update Mutations
Now let's take a look at an update operation:
updateTask(
id: ID!,
title: String,
description: String,
completed: Boolean,
listId: ID
): Task @bb_resolver(
_type: update,
_object: Task,
_filter: { id: { _eq: "$id" } },
_fields: {
title: "$title",
description: "$description",
completed: "$completed",
list: { id: { "$listId" } }
}
)
This can of course be something a little broader, for example, you could specify a filter criteria like _filter: { user: { username: { _eq: "$userName" } } }
which would allow for the update of all tasks of the user specified by \$userName.
The _fields, similar as the case with the insert, indicate which fields to affect.
Delete Mutations
A delete operation deletes one or more records from a database. Here's an example:
cleanUp(userName: String!): Task @bb_resolver(_type: delete, _object: Task, _filter: { user: { username: { _eq: "$userName" } }, completed: { _eq: true } })
This will delete all completed tasks for the specified user. Delete does not have a _fields argument of course, entire rows are deleted according to the filter criteria.
Input Objects
Before we continue with more complicated resolver types, it might be worthwhile at this point to talk a little about GraphQL input objects. When you want to pass an object as an argument to an operation, you cannot, in GraphQL, use an object type that you've created. These objects are meant only for output (these are used to specify the output expected from an operation/request). GraphQL has input objects that were specifically created to specify objects that are passed into an operation. Let's have a look at an example:
type Task {
id: ID!
title: String!
description: String,
completed: Boolean!
}
input inputTask {
title: String!
description: String,
completed: Boolean!
}
insertTask(input: inputTask!): Void
@bb_resolver(
_type: INSERT,
_object: Task,
_fields: {
title: "$input.$title"
description: "$input.$description"
completed: "$input.$completed"
}
)
Nested Insert Mutations
Let's consider the following example:
type Task {
id: ID!
title: String!
description: String,
completed: Boolean!
notes: [TaskNote]
user: User!
}
type TaskNote {
id: ID!
note: String!
}
In this example, you will notice that TaskNote does not contain a reference to Task. This means there's actually no way to reference the Task when inserting the a TaskNote. Furthermore, even if you could reference the Task in TaskNote, sending the id back to the client application and then sending a list of task notes back to the server thereafter is quite a round about route to take. This is were a nested insert would come in.
This scenario is often referred to as a master-detail relationship, with Task being the master and TaskNote being the detail. There are quite some occasions where you might want to insert them together, in one operation.
Let us look at how you would specify a nested insert operation here:
input inputTaskNote {
note: String!
}
input inputTask {
title: String!
description: String,
completed: Boolean!
}
input inputTaskWithNotes {
task: inputTask!
notes: [inputTaskNote!]
}
insertTaskWithNotes(input: inputTaskWithNotes!): Void
@bb_resolver(
_type: INSERT_NESTED,
_object: Task,
_fields: {
_insert: true
title: "$input.$task.$title"
description: "$input.$task.$description"
completed: "$input.$task.$completed"
notes: [
{
_insert: true
note: "$input.$notes.$note"
}
]
user: {
_select: true
id: "@currentUserId"
}
}
)
_insert: true
and _select: true
, these specify whether we're inserting into a detail object/table or whether this is a reference to another object (in this case to a user). If we changed the user _select: true
to a _insert: true
, the operation will attempt to insert a new user (this might fail if there are mandatory fields that are missing).
Also note in the above example that the operation will, in SQL, insert the Task entry first, receive a new id for the task, and use that id to create the list of TaskNote using the Task id to associate task notes to task (in SQL terms, the task id becomes a foreign key into the TaskNote table). This all happens in one SQL statement generated by the Compiler.
Nested Update Mutations
Let's modify the object types specified in the example we've been working on above:
type Task {
id: ID!
title: String!
description: String,
completed: Boolean!
note: TaskNote
user: User!
}
type TaskNote {
id: ID!
note: String!
}
notes: [TaskNote]
note list to a single note: TaskNote
. This means that, instead of a task have a list of notes, a task will now have a single note. In SQL terms, in the previous example, Task id would have been a foreign key column in the TaskNote table. In this case, in SQL terms, the TaskNote id would now be a foreign key column into the Task table. Put another way, if note in the Task object needs to be updated, you might choose to insert a new note and update the Task object with the new TaskNote reference.
So, let's have a look at how this works. Introducing the UPDATE_NESTED
resolver type:
insertTaskWithNotes(input: inputTaskWithNotes!): Void
@bb_resolver(
_type: UPDATE_NESTED,
_object: Task,
_fields: {
_update: true
title: "$input.$task.$title"
description: "$input.$task.$description"
completed: "$input.$task.$completed"
note: {
_insert: true
note: "$input.$notes.$note"
}
}
)
Void Operations
There are occasions when you don't want to an operation to return any values. For example, you might a delete operation and, once the record is delete, there is nothing you'd want returned. By default, GraphQL does not actually provide an ability to do this, all operations must return a valid type with data. We have work around this by creating a new scalar type called Void
.
To use the Void
scalar type, you first need to declare your own custom type using the Void
like so:
You can then use this in when defining a new operation:
updateTask(id: ID!, title: String, description: String): Void
@bb_resolver(
_type: update,
_object: Task,
_filter: { id: { _eq: "$id" } },
_fields: {
title: "$title",
description: "$description"
}
)
Void
operation will simply return data: None
if the operation was successful.
Orchestrator Resolvers
Orchestrators are a type of resolver that allows an operation to be resolved by joining multiple other operations together. This is useful for ease of use (calling one operation instead of multiple), to create more complicated functionality or to create a sequence of operations that execute under the same transaction.
Defining Orchestrators
Let us use an example of replacing a task. Replacing a task might entail marking an existing task as completed and insert a new task in its place. So the insert and update for the task object might be written like this (similar to the examples above):
createTask(
title: String!,
description: String,
completed: Boolean!,
list: List!
user: User!
): Task @bb_resolver(
_type: insert,
_object: Task,
_fields: {
title: "$title",
description: "$description",
completed: "$completed",
list: "$list",
user: "$user"
}
)
updateTaskCompleted(id: ID!): Task @bb_resolver(
_type: update,
_object: Task,
_filter: { id: { _eq: "$id" } },
_fields: {
completed: true
}
)
replaceTask(
old_id: ID!,
title: String!,
description: String,
completed: Boolean,
list: List,
user: User!
): Appointment
@bb_resolver(
_type: ORCHESTRATOR
_steps: [
{
_mutation: "updateTaskCompleted"
_arguments: {
id: "$old_id"
}
}
{
_mutation: "createTask"
_arguments: {
title: "$title",
description: "$description",
completed: "$completed"
list: "$list",
user: "$user"
}
}
]
)
Operations are executed sequentially and by default fall under one database transaction - meaning that if either operations fails, the other will fail as well. Note though that if you include an operation that has a call to an external microservice, this would not be covered under the same transaction.
Data Access & Security
basebox includes a data access and security model called a Data Ownership model. This covers cases where the user who created the data owns the data. Users are then the only ones to have access to their data (unless they grant access to their data to someone else or your agreement with the user specified they grant you access to their data). This model allows data to remain private and to only be accessible by users with the correct access rights.
Let's consider the following example which expands on Example 1:
type List @bb_owned {
id: ID!
title: String!
tasks: [Task]
user: User!
}
type Task @bb_owned {
id: ID!
title: String!
description: String,
completed: Boolean!
type: TaskType!
user: User!
list: List!
}
type TaskType {
id: ID!
name: String!
}
type User @bb_user {
username: String! @bb_primaryKey
name: String
tasks: [Task]
lists: [List]
}
@bb_owned
directive to show that they are objects/tables where records within them are owned by a specified user. When this directive is added, the corresponding List and Task tables are created with a .ownerId
column. This columns stores the user id (this is the user id of the logged in user which is received in the access token from the IdP) of the user who creates the record (indicating that the user owns this record). Any selects, updates, or deletes run against these objects/tables will automatically filter by .ownerId
field as well to make sure that only users that have access to the record view that record.
The TaskType object does not need the @bb_owned
directive, a task type would not be owned by one user, this will be used by all users.
The User object does not have the @bb_owned
directive by has the @bb_user
directive instead. This implies @bb_owned
but goes further by also mapping the IdP user (from Auth0 or KeyCloak for example) maps 1-to-1 with a user in this table. This indicates that a specific user in the IdP is also a specific user in the database. @bb_user
will create a unique constraint across username
(being the primary key of the User table) and .ownerId
(being the user id from the IdP). Note that in the access token, the user id is in the sub
field.
The @bb_user
also serves another purpose. Let's look at this operation definition:
getUser: User @bb_resolver(_type: select, _object: User, _filter: { username: { _eq: "@currentUserId" } })
username
to the getUser
operation, this will call getUser
for the currently logged in user. The @currentUserId
is special basebox directive that indicates that the system should take the sub
from the access token (i.e. the IdP user), search for the corresponding username
in the User table and use that as the value for @currentUserId
. This will thus return the correct values for the logged in user. This also alleviates a security issue - if getUser
had username
as the argument, anyone with access to the getUser
operation could potentially call the operation for any user, as long as they had the username of that user.