In the Trailmaker™ Insights module, there is a dataset repository where you can choose a publicly available dataset to explore.
Why use the Trailmaker dataset repository?
There are multiple ways to benefit from the Trailmaker dataset repository:
1. New users can quickly and easily grab a dataset to start exploring the features and functionalities of the platform, without having to wait for data upload and processing.
2. Datasets in the repository that are particularly relevant to your field of study can be used to validate your findings. For example, you could confirm the expression of a specific gene in a particular cell type, or confirm the presence of a specific cell type or state in a given tissue or disease.
3. You can integrate your own data with a relevant dataset from the repository to increase the statistical power of your results. To do this, click ‘Explore’ for the relevant dataset in the repository, which creates a new, independent copy of the dataset in your account. Then add more samples to the project using the ‘Add data’ button. With more samples in your project, some calculations, including differential expression analysis, can be more statistically powerful. When you do this, make sure to add the dataset source as metadata so that you can assess differences between datasets.
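As an illustration of the metadata step in point 3, here is a minimal Python sketch of a sample table that records where each sample came from. In Trailmaker itself, metadata is added within the project, so none of this code is required. The sample names, the `dataset_source` column name, and both helper functions are hypothetical and are not part of any Trailmaker API or upload format.

```python
import csv
import io


def build_sample_metadata(repository_samples, own_samples,
                          repository_label="repository",
                          own_label="own_data"):
    """Return one row per sample, tagged with the dataset it came from.

    The labels and the "dataset_source" column name are illustrative
    assumptions, not names used by Trailmaker.
    """
    rows = []
    for name in repository_samples:
        rows.append({"sample": name, "dataset_source": repository_label})
    for name in own_samples:
        rows.append({"sample": name, "dataset_source": own_label})
    return rows


def to_tsv(rows):
    """Render the metadata rows as tab-separated text for easy review."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=["sample", "dataset_source"],
        delimiter="\t",
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical sample names: two from a repository dataset, two of your own.
rows = build_sample_metadata(["pbmc_1", "pbmc_2"], ["patient_A", "patient_B"])
print(to_tsv(rows))
```

Keeping the source as its own metadata column means you can later group or compare samples by dataset, which helps separate genuine biological differences from differences between the datasets themselves.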
How to access the dataset repository
The dataset repository can be accessed directly via this link:
https://app.trailmaker.parsebiosciences.com/repository
Alternatively, you can navigate to the Insights data management page, and then select ‘Create new project’ and choose ‘Select from Dataset repository’.
The dataset repository contains ~50 publicly available datasets, totalling >6 million cells. Some specific datasets to draw your attention to are:
- The Parse Biosciences Evercode™ v3 human immune cells (PBMCs) dataset that was used in the recent Trailmaker webinar demo is top of the list in the dataset repository.
- The dataset from the new Comparison of Evercode™ WT v3 and Chromium™ GEM-X Single Cell 3’ Kit v4 in Mouse Brain Nuclei tech note is second on the list in the repository.
All datasets in the repository are available to explore for free. Simply click ‘Explore’ on a dataset of your choice.
Clicking ‘Explore’ on any dataset within the repository opens up a drop-down with two options:
- View: Viewers can explore all aspects of the project including data processing settings and plots, clusters and UMAPs, differential expression and a variety of plot visualizations. However, Viewers cannot change settings or clusters.
  You should select this option if you want to quickly explore the platform features and/or project but don't need to make any changes to the project. It's fast to get started but has restricted access.
- Copy: By creating a copy, you will become the project Owner. Owners have full control over data processing settings, clusters including the generation of custom clusters, and all changes are saved. Note that copying large projects can take some time.
  You should select this option if you want to make changes to the project, such as to Data Processing parameters or clusters. It's slower to get started (due to files being copied) but has access to all functionality in the platform.
How to cite a dataset from the Trailmaker dataset repository
If you use one of the datasets from Trailmaker’s dataset repository in further analysis that you then wish to publish, you should cite the original article and data source (as listed on the dataset repository page), as well as the Trailmaker application itself. Guidance and example citation statements are provided in our support suite.
How to suggest an addition to the dataset repository
If you’d like to suggest a publicly available dataset to be made available in the Trailmaker dataset repository, you can send the dataset details, including a link to the dataset analysis in your Trailmaker account, to support@parsebiosciences.com.