
# Creating a Dataset

This sample creates and validates a vBase set.

You can find the implementation in [`create_set.py`](https://github.com/validityBase/vbase-py-samples/blob/main/samples/create_set.py).

## Summary <a href="#summary" id="summary"></a>

A set is a collection of objects. A named set of data records is a dataset. Such datasets can hold any point-in-time (PIT) or bitemporal data and prove its provenance to third parties.

The sample illustrates low-level vBase set operations. These operations expose all vBase features and provide the most control, but they lack the convenience of higher-level abstractions.

## Detailed Description <a href="#detailed-description" id="detailed-description"></a>

* Create a vBase object using a Web3 HTTP commitment service. The commitment service is a smart contract running on a blockchain. The initialization uses connection parameters specified in environment variables:

  ```python
  vbc = VBaseClient.create_instance_from_env()
  ```
* Create the test set commitment. This operation records that the user identified by the `VBASE_COMMITMENT_SERVICE_PRIVATE_KEY` environment variable has created the named dataset. Such commitments are used to validate that a given collection of user datasets is complete and to mitigate Sybil attacks (<https://en.wikipedia.org/wiki/Sybil_attack>).

  The returned receipt contains information on the set commitment. It can optionally be retained to simplify subsequent validation. All receipts are also available via vBase indexing services.

  Since `add_set()` and `add_named_set()` calls are idempotent, duplicate calls for a set that already exists are no-ops and return empty receipts:

  ```python
  receipt = vbc.add_named_set(SET_NAME)
  print("add_named_set() receipt:\n%s" % pprint.pformat(receipt))
  ```

* Verify that a given set commitment exists for a given user. This will typically be called by the data consumer to verify a producer’s claims about dataset provenance:

  ```python
  assert vbc.user_named_set_exists(vbc.get_default_user(), SET_NAME)
  ```
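
The create-then-verify flow above can be illustrated without a blockchain connection. The following is a minimal in-memory sketch, not the vBase implementation: `ToySetRegistry`, the receipt fields, the example user addresses, and the choice to identify a set by a SHA3-256 digest of its name are all assumptions made for illustration. Only the method names `add_named_set()` and `user_named_set_exists()` mirror the calls shown above.

```python
import hashlib


class ToySetRegistry:
    """Hypothetical in-memory stand-in for a commitment service (not the vBase API)."""

    def __init__(self):
        # Each commitment is a (user, set identifier) pair.
        self._commitments = set()

    @staticmethod
    def set_id(name):
        # Assumption for illustration: a set is identified by a digest of its name.
        return hashlib.sha3_256(name.encode("utf-8")).hexdigest()

    def add_named_set(self, user, name):
        key = (user, self.set_id(name))
        if key in self._commitments:
            # Idempotent: a duplicate call is a no-op and yields an empty receipt.
            return {}
        self._commitments.add(key)
        # Hypothetical receipt layout; real receipts may carry different fields.
        return {"user": user, "set_id": key[1]}

    def user_named_set_exists(self, user, name):
        # A consumer checks that this specific user committed this specific set.
        return (user, self.set_id(name)) in self._commitments


registry = ToySetRegistry()
producer = "0xProducer"  # placeholder address, not a real account
first_receipt = registry.add_named_set(producer, "TestSet")
second_receipt = registry.add_named_set(producer, "TestSet")

print("first receipt:", first_receipt)
print("second receipt:", second_receipt)
print("exists for producer:", registry.user_named_set_exists(producer, "TestSet"))
print("exists for other user:", registry.user_named_set_exists("0xOther", "TestSet"))
```

Because each commitment is keyed by both the user and the set, a different user cannot claim the producer's set in this toy model, and repeating the creation call does not change the stored state.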


