Getting Started
dataset.sh is a dataset manager designed to simplify the process of installing, managing, and publishing datasets.
We hope to make working with datasets as straightforward as using package managers like npm or pip for programming libraries.
Install
To get started, install dataset.sh via pip:
pip install dataset.sh -U
dataset.sh --help
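If the `dataset.sh --help` command is not found after installation, it can help to check which Python environment the package landed in. The following is a minimal sketch using only the Python standard library. It assumes that the installed distribution and the command-line entry point are both named `dataset.sh`, as the two commands above suggest; the helper names `installed_version` and `describe_install` are hypothetical and are not part of dataset.sh.

```python
from importlib import metadata
import shutil


def installed_version(dist_name: str):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def describe_install(dist_name: str = "dataset.sh", command: str = "dataset.sh") -> str:
    """Summarize whether the package is installed and whether its command is on PATH.

    The default names are assumptions taken from the pip and --help commands above.
    """
    version = installed_version(dist_name)
    if version is None:
        return f"{dist_name} is not installed in this environment"
    location = shutil.which(command)
    if location is None:
        return f"{dist_name} {version} is installed, but '{command}' is not on PATH"
    return f"{dist_name} {version} is installed; '{command}' resolves to {location}"


if __name__ == "__main__":
    print(describe_install())
```

Run it with the same interpreter that pip used. If the package is reported as installed but the command is not on PATH, the scripts directory of that environment is probably missing from your shell's PATH.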
Not interested in publishing, but want to see what you can do with a dataset? Jump to "How to import datasets" below.
How to create and publish datasets
You can build and publish datasets using either of the following:
- Our dataset publishing framework, designed to make the publishing experience as smooth as possible.
- A lower-level, more flexible approach that may require more work.
Examples
Tutorial: media datasets
In this guide, you will learn how to bundle a dataset with media files.
Tutorial: synthetic data datasets
In this guide, you will learn how to create and bundle a dataset with synthetic data.
How to import datasets
You can load the contents of a dataset by following the instructions in our dataset browser web UI.
dataset.sh gui hello/world