Sync Data
Add facts in development
Whether starting fresh or making changes to existing rules, the quickest way to iterate on the facts stored in Oso Cloud is via the Fact Schema in the UI. The Fact Schema lists the types of facts referenced in your policy; these are the types of facts Oso Cloud expects you to send.

To add a new fact, click + Add next to the type of fact you want to add. To
remove an existing fact, click ▼ Show matching facts and then click the Delete
button next to the fact you want to delete.
Sync facts in production
Oso Sync is only available for Startup and Growth plan customers.
Initial sync
Once you've decided how to represent your authorization data in Oso Cloud, you'll need to do a one-time sync to bring Oso Cloud in line with your data. We provide Oso Sync to update the facts in Oso Cloud to match those in your application database.
You can use Oso Sync from the CLI with the oso-cloud reconcile command.
oso-cloud reconcile --perform-updates reconcile.yaml
Configuration
For Oso Sync to know where to find the facts you need, create a YAML configuration file that maps your data to facts in Oso Cloud. We currently support the following data sources:
PostgreSQL
version: 1
source: postgres
facts:
  has_relation(Repository:_, String:parent, Organization:_):
    db: app_db
    query: |-
      select repository.public_id, organization.public_id
      from repository
      join organization
      on organization.id = repository.organization_id
dbs:
  app_db:
    connection_string: postgresql://oso:oso@somerds.instance.aws.com:5432/foo
Besides version and source, which appear in the example above, the config file has two top-level fields: facts and dbs.
- dbs contains a list of databases from which Oso Sync should pull the fact data. Each entry is keyed by a unique name and contains a connection_string value, which needs to conform to a PostgreSQL connection URI. Alternatively, you can provide an environment variable (prefixed with a $) containing the connection string: connection_string: $ENV_VAR_NAME.
- facts maps fact types to the database query that fetches all facts of that type. Each fact type is defined with positional variable slots (specified by an underscore _), which are filled by the query in order to generate the facts. For instance, the fact type has_relation(Repository:_, String:parent, Organization:_) has two variables: one in the first argument for the Repository and one in the third argument for the Organization.
  - db is the database that contains fact data for this fact type. Its value should match an identifier from the dbs section.
  - query is the query to fetch all facts of that fact type. Match the columns you're fetching data from positionally with the variables in the fact type. In the example above, repository.public_id is set as the Repository value in the first argument of the fact type, and organization.public_id is set as the Organization value in the third argument.
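To make the positional matching concrete, here is a minimal Python sketch of how one row returned by the query could be slotted into a fact type. It only illustrates the rule described above and is not Oso Sync's implementation; the helper name fill_fact_type and the sample IDs are hypothetical.

```python
import re

_MISSING = object()


def fill_fact_type(fact_type, row):
    """Fill each `_` slot in a fact type with the next column of a query row.

    Hypothetical helper for illustration; not part of Oso Sync.
    """
    match = re.fullmatch(r"(\w+)\((.*)\)", fact_type.strip())
    if match is None:
        raise ValueError(f"not a fact type: {fact_type!r}")
    predicate, raw_args = match.groups()

    columns = iter(row)
    args = []
    for raw_arg in raw_args.split(","):
        type_name, value = raw_arg.strip().split(":", 1)
        if value == "_":
            # Variable slot: consume the next column, in query order.
            value = next(columns, _MISSING)
            if value is _MISSING:
                raise ValueError("row has fewer columns than variable slots")
        args.append((type_name, value))

    if next(columns, _MISSING) is not _MISSING:
        raise ValueError("row has more columns than variable slots")
    return (predicate, *args)


# One row as it might come back from the example query (made-up IDs).
fact = fill_fact_type(
    "has_relation(Repository:_, String:parent, Organization:_)",
    ("repo-1", "org-1"),
)
```

The concrete String:parent argument is carried through unchanged, so only the two selected columns vary from fact to fact.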
 
MongoDB
version: 1
source: mongodb
facts:
  has_relation(Repository:_, String:parent, Organization:_):
    db: app_db
    collection: has_relation
    fields:
      - name: repository
      - name: organization
        is_array: true
    query:
      find: {}
      # `find` and `aggregate` are mutually exclusive
      # aggregate: []
dbs:
  app_db:
    connection_string: mongodb://oso:oso@somemongo.instance.aws.com:27017/foo
The config file has four top-level fields: version, source, dbs, and facts.
- version should have a value of 1.
- source should have a value of mongodb.
- dbs contains a list of databases from which Oso Sync should pull the fact data. Each entry is keyed by a unique name and contains a connection_string value, which needs to conform to a MongoDB connection URI. Alternatively, you can provide an environment variable (prefixed with a $) containing the connection string: connection_string: $ENV_VAR_NAME.
- facts maps fact types to the database query that fetches all facts of that type. Each fact type is defined with positional variable slots (specified by an underscore _), which are filled by the query in order to generate the facts. For instance, the fact type has_relation(Repository:_, String:parent, Organization:_) has two variables: one in the first argument for the Repository and one in the third argument for the Organization.
  - db is the database that has the collection with the data for this fact type.
  - collection is the collection that contains data for this fact type.
  - fields is an array containing the names of the fields to extract from the documents returned by the query. Each array item maps to the positional variable in the fact type, and all variables must be included. An item may have an optional is_array field; if is_array is true, the field on the document must be an array type and is automatically unwound. At most one field may be configured with is_array: true.
  - query is the query to fetch all documents that contain data for the fact type. Either the find or the aggregate field may be used for the query, and these are passed directly to the MongoDB find and aggregate commands, respectively. The example above illustrates a query using find. For aggregate queries, use of the $out stage results in an error.
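The effect of is_array can be sketched in a few lines of Python. This is an illustration of the unwinding behavior described above, not Oso Sync's code; the function name unwind_document and the sample document are hypothetical.

```python
def unwind_document(document, fields):
    """Yield one tuple of values per fact, unwinding the array field if any.

    `fields` mirrors the `fields` list in the config: dicts with a `name`
    and an optional `is_array` flag. Hypothetical helper for illustration.
    """
    array_fields = [f["name"] for f in fields if f.get("is_array")]
    if len(array_fields) > 1:
        raise ValueError("at most one field may set is_array: true")

    if not array_fields:
        # No array field: the document yields exactly one row.
        yield tuple(document[f["name"]] for f in fields)
        return

    array_name = array_fields[0]
    for element in document[array_name]:
        # One row per array element; other fields repeat unchanged.
        yield tuple(
            element if f["name"] == array_name else document[f["name"]]
            for f in fields
        )


fields = [{"name": "repository"}, {"name": "organization", "is_array": True}]
document = {"repository": "repo-1", "organization": ["org-1", "org-2"]}
rows = list(unwind_document(document, fields))
```

Each resulting row then fills the variable slots of the fact type in order, so a single document with a two-element array contributes two facts.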
 
Comma-separated Values (CSV)
version: 1
source: csv
facts:
  has_relation(Repository:_, String:parent, Organization:_):
    fields:
      - name: repository
      - name: organization
    path: /path/to/has_relation.csv
The config file has three top-level fields: version, source, and facts.
- version should have a value of 1.
- source should have a value of csv.
- facts maps fact types to the CSV file with the data of that type. Each fact type is defined with positional variable slots (specified by an underscore _), which are filled with data from the corresponding values in the CSV file. For instance, the fact type has_relation(Repository:_, String:parent, Organization:_) has two variables: one in the first argument for the Repository and one in the third argument for the Organization.
  - fields is an array containing the names of the values to extract from the CSV file. The first row in the CSV file must be a header row and must include all of the items in the fields array. Each array item maps to the positional variable in the fact type, and all variables must be included.
  - path is the path to the CSV file with the data for the fact type.
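As a rough illustration of the header requirement, the following Python sketch reads a CSV file and extracts the configured fields in the order they are listed. It assumes that columns are looked up by header name and that extra columns are ignored; the source only states that the header must include every item in fields. The function name rows_from_csv and the sample data are hypothetical.

```python
import csv
import io


def rows_from_csv(csv_file, field_names):
    """Return one tuple per data row, with values ordered as in `field_names`.

    Hypothetical helper for illustration; not part of Oso Sync.
    """
    reader = csv.DictReader(csv_file)  # the first row is the header row
    header = reader.fieldnames or []
    missing = [name for name in field_names if name not in header]
    if missing:
        raise ValueError(f"header row is missing fields: {missing}")
    return [tuple(record[name] for name in field_names) for record in reader]


# In-memory stand-in for /path/to/has_relation.csv (made-up contents).
sample = io.StringIO(
    "organization,repository\n"
    "org-1,repo-1\n"
    "org-1,repo-2\n"
)
rows = rows_from_csv(sample, ["repository", "organization"])
```

The order of the fields list, not the column order in the file, decides which value lands in which variable slot in this sketch.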
 
Add and remove facts
Whenever you insert, update, or delete authorization-relevant data in your application, you should use Oso Cloud's Bulk API to mirror that update in Oso Cloud.
This "dual writes" approach is similar to updating an Elasticsearch index to provide up-to-date search results. Oso Cloud is a fast and flexible index for your authorization data that's optimized for producing sub-millisecond authorization decisions.
For example, in our GitCloud example app, when a user creates a new repository, we send a pair of facts to Oso Cloud:
def create_repository(org_id):
    org = Organization(org_id)
    repo = Repository(payload["name"], org)
    # Open a transaction to persist the repository to our datastore.
    session.add(repo)
    # Send facts to Oso Cloud.
    with oso.batch() as tx:
        # The parent organization of `repo` is `org`.
        tx.insert(("has_relation", repo, "organization", org))
        # The creating user gets the "admin" role on the new repository.
        tx.insert(("has_role", current_user, "admin", repo))
    # Once the bulk update to Oso Cloud succeeds, commit the transaction.
    session.commit()
    return repo.as_json(), 201
When deleting a repository, the process is identical, but the facts in the Bulk API call go in the removal array. Additionally, you can use wildcards to remove all facts matching a pattern:
with oso.batch() as tx:
    # Remove all `has_relation` facts for the repository.
    tx.delete(("has_relation", repo, None, None))
    # Remove all `has_role` facts for the repository.
    tx.delete(("has_role", None, None, repo))
Wildcards are represented as None in Python, null in JavaScript, nil in
Ruby, and so on.
When creating new resources, send corresponding facts to Oso Cloud before closing the local transaction. This way, we tell the user we’ve created the new resource once they’re able to access it.
When deleting existing resources, remove corresponding facts from Oso Cloud after closing the local transaction. We wait to remove access until we’re sure the resource no longer exists.
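The two orderings can be compared side by side with a small, self-contained Python sketch. RecordingSession and RecordingOso are hypothetical stand-ins that only log the order of calls; they are not the Oso Cloud SDK or a real database session.

```python
class RecordingSession:
    """Hypothetical stand-in for a database session; records each call."""

    def __init__(self, log):
        self.log = log

    def add(self, obj):
        self.log.append("db.add")

    def delete(self, obj):
        self.log.append("db.delete")

    def commit(self):
        self.log.append("db.commit")


class RecordingOso:
    """Hypothetical stand-in for an Oso Cloud client; records each call."""

    def __init__(self, log):
        self.log = log

    def insert(self, fact):
        self.log.append("oso.insert")

    def delete(self, fact):
        self.log.append("oso.delete")


def create_repository(session, oso, repo, org):
    session.add(repo)
    # Creating: send facts before committing the local transaction.
    oso.insert(("has_relation", repo, "organization", org))
    session.commit()


def delete_repository(session, oso, repo):
    session.delete(repo)
    # Deleting: commit the local transaction first, then remove facts.
    session.commit()
    oso.delete(("has_relation", repo, None, None))
    oso.delete(("has_role", None, None, repo))


create_log, delete_log = [], []
create_repository(RecordingSession(create_log), RecordingOso(create_log), "repo-1", "org-1")
delete_repository(RecordingSession(delete_log), RecordingOso(delete_log), "repo-1")
```

After running this, create_log shows the fact insert happening before the commit, while delete_log shows the commit happening before the fact removals.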
To add and remove facts in a single transaction — for example, when updating a
user's role from member to admin — use the Bulk API:
with oso.batch() as tx:
    tx.delete(("has_role", user, None, repo))
    tx.insert(("has_role", user, "admin", repo))
The Bulk API processes fact removals before additions, so after the above call
the user has exactly one role on the repository: admin.
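The removals-before-additions rule, together with None as a wildcard, can be modeled locally in plain Python. This sketch is a simplified in-memory model of the documented semantics, not how Oso Cloud stores or matches facts; matches and apply_batch are hypothetical names.

```python
def matches(pattern, fact):
    """True if `fact` matches `pattern`, where None matches any value."""
    return len(pattern) == len(fact) and all(
        p is None or p == f for p, f in zip(pattern, fact)
    )


def apply_batch(facts, deletes, inserts):
    """Apply all removals first, then all additions, to a set of facts."""
    remaining = {
        fact for fact in facts
        if not any(matches(pattern, fact) for pattern in deletes)
    }
    return remaining | set(inserts)


facts = {
    ("has_role", "alice", "member", "repo-1"),
    ("has_role", "bob", "member", "repo-1"),
}
updated = apply_batch(
    facts,
    deletes=[("has_role", "alice", None, "repo-1")],
    inserts=[("has_role", "alice", "admin", "repo-1")],
)
```

Because the wildcard removal runs before the insert, the newly added admin fact is not swept away by it, and other users' roles are untouched.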
Keep facts in sync
To ensure authorization data remains in sync with application data, it's good practice to periodically refresh the facts in Oso Cloud. You can use Oso Sync to identify any data drift as well as synchronize your application data to the facts in Oso Cloud.
Using the configuration file from the initial sync:
- To compute the diff only, run:
oso-cloud reconcile reconcile.yaml
- To compute and apply the diff, run:
oso-cloud reconcile --perform-updates reconcile.yaml
This returns the diff over stdout. If the --perform-updates flag is passed,
the diff output represents the differences before applying the diff.
If 1000 or fewer facts have changed, Oso Sync returns the lists of facts to add or remove:
{
  "type": "facts",
  "fact_types": [
    {
      "fact_type": <Fact>,
      "add": [<Fact>, ...],
      "remove": [<Fact>, ...]
    }
  ]
}
If more than 1000 facts have changed, Oso Sync returns the counts instead:
{
  "type": "counts",
  "fact_types": [
    {
      "fact_type": <Fact>,
      "add_count": 501,
      "remove_count": 500
    }
  ]
}
Oso Sync formats facts in their fully-expanded JSON representation.
Any variables in the fact type are represented by a null value:
{
  "predicate": "has_relation",
  "args": [
    { "type": "Repository", "id": null },
    { "type": "String", "id": "parent" },
    { "type": "Organization", "id": null }
  ]
}
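If you want to act on the diff in a script, a small parser can normalize both output shapes into per-fact-type counts. The sketch below assumes stdout contains exactly one JSON document with the fields shown above; summarize_diff and fact_type_label are hypothetical helper names, and the sample output is made up.

```python
import json


def fact_type_label(fact_type):
    """Render an expanded fact type back into its compact form."""
    parts = []
    for arg in fact_type["args"]:
        # A null id marks a variable slot, written as `_` in the config.
        value = "_" if arg["id"] is None else arg["id"]
        parts.append(arg["type"] + ":" + value)
    return fact_type["predicate"] + "(" + ", ".join(parts) + ")"


def summarize_diff(output):
    """Return a list of (fact type, facts to add, facts to remove)."""
    diff = json.loads(output)
    summary = []
    for entry in diff["fact_types"]:
        if diff["type"] == "facts":
            counts = (len(entry["add"]), len(entry["remove"]))
        elif diff["type"] == "counts":
            counts = (entry["add_count"], entry["remove_count"])
        else:
            raise ValueError("unexpected diff type: " + repr(diff["type"]))
        summary.append((fact_type_label(entry["fact_type"]), *counts))
    return summary


sample_output = json.dumps({
    "type": "counts",
    "fact_types": [{
        "fact_type": {
            "predicate": "has_relation",
            "args": [
                {"type": "Repository", "id": None},
                {"type": "String", "id": "parent"},
                {"type": "Organization", "id": None},
            ],
        },
        "add_count": 501,
        "remove_count": 500,
    }],
})
summary = summarize_diff(sample_output)
```

Keying the summary by the full fact type rather than the predicate alone keeps sharded definitions of the same predicate distinct.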
Oso Sync Limitations
- At most one Oso Sync command should be run at a time for a given environment. If multiple Oso Sync commands are run in parallel for an environment, you may see HTTP 419 errors.
- The maximum size of the application data per fact type is 10GB. To synchronize larger data sets, you may consider "sharding" a single fact type across multiple fact type definitions in the YAML configuration by substituting a concrete value for one or more of the arguments.
  Before:
    has_relation(Repository:_, String:_, Organization:_): ...
  After:
    has_relation(Repository:_, String:parent, Organization:_): ...
    has_relation(Repository:_, String:child, Organization:_): ...
- The diff may include transient false positives because Oso Sync compares a point-in-time snapshot of your database to Oso Cloud, which continues to receive changes. Transient false positives should not appear on successive invocations of Oso Sync and do not indicate issues with how your application updates facts in Oso Cloud.
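One way to separate transient false positives from real drift is to compare the diffs from two successive runs and keep only the facts that appear in both. The Python sketch below is one possible approach under that assumption, not a feature of Oso Sync; it only handles diffs of type "facts", and persistent_drift and fact_keys are hypothetical names.

```python
import json


def fact_keys(diff, field):
    """Collect the facts under `field` ("add" or "remove") as hashable keys."""
    keys = set()
    for entry in diff["fact_types"]:
        for fact in entry[field]:
            # Serialize with sorted keys so equal facts compare equal.
            keys.add(json.dumps(fact, sort_keys=True))
    return keys


def persistent_drift(first_diff, second_diff):
    """Return the facts reported by both runs, grouped by operation."""
    return {
        field: sorted(fact_keys(first_diff, field) & fact_keys(second_diff, field))
        for field in ("add", "remove")
    }


def role_fact(user):
    # Made-up fact in the expanded JSON shape, for the example only.
    return {
        "predicate": "has_role",
        "args": [
            {"type": "User", "id": user},
            {"type": "String", "id": "admin"},
            {"type": "Repository", "id": "repo-1"},
        ],
    }


first = {"type": "facts", "fact_types": [
    {"fact_type": None, "add": [role_fact("alice"), role_fact("bob")], "remove": []},
]}
second = {"type": "facts", "fact_types": [
    {"fact_type": None, "add": [role_fact("alice")], "remove": []},
]}
drift = persistent_drift(first, second)
```

Here the fact for bob showed up in only one run and is dropped as likely transient, while the fact for alice appears in both runs and is kept for investigation.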
Docker
We publish a wrapped up version of the CLI (x86_64) for Oso Sync at public.ecr.aws/osohq/reconcile:latest.
To use it, build your own image on top of this using a Dockerfile like this:
FROM public.ecr.aws/osohq/reconcile:latest
ARG CONFIG_PATH
RUN test -n "$CONFIG_PATH" || (echo "CONFIG_PATH argument must be set to path of your reconcile.yaml" && false)
WORKDIR /app
COPY $CONFIG_PATH /app/config.yaml
ENTRYPOINT ["/app/reconcile", "experimental", "reconcile", "/app/config.yaml"]
Build it with:
docker build -t reconcile-tool -f reconcile-tool.Dockerfile --build-arg="CONFIG_PATH=./reconcile.yaml" --platform linux/amd64 .
Talk to an Oso engineer
If you'd like to learn more about using Oso Cloud in your app or have any questions about this guide, connect with us on Slack. We're happy to help.