.. _geninputs:

Generating inputs for BEELINE
##############################

BoolODE provides additional scripts, located in ``scripts/``, to
process simulation output for use with BEELINE.

Compute Pseudotime using Slingshot
###################################

.. note:: ``runSlingshot.py`` requires the Slingshot Docker container.
          Make sure Docker has been set up and the container has been
          built.

Slingshot computes pseudotime trajectories for a given dataset by first
carrying out dimensionality reduction, then running *k*-means
clustering on the low-dimensional embedding in order to compute
trajectories. The number of clusters expected depends on the features
of the dataset. For instance, a dataset with two steady-state clusters
should be run with ``--nClusters 3``, which accounts for an additional
initial-state cluster.

``runSlingshot.py`` takes the following command line arguments:

.. code:: text

    -h, --help            show this help message and exit
    --outPrefix=OUTPREFIX
                          Prefix for output files.
    -e EXPR, --expr=EXPR  Path to expression data file
    -p PSEUDO, --pseudo=PSEUDO
                          Path to 'pseudotime' file generated by BoolODE
    -c NCLUSTERS, --nClusters=NCLUSTERS
                          Number of expected clusters in the dataset.
    --noEnd               Do not force Slingshot to have an end state.
    -r PERPLEXITY, --perplexity=PERPLEXITY
                          Perplexity for tSNE.

.. note:: ``runSlingshot.py`` requires a 'pseudotime' file passed using
          the ``--pseudo`` option. This file should contain the actual
          simulation time from BoolODE, so that the inferred trajectory
          can be compared against the actual simulation time values.
          This file is NOT required by Slingshot itself.

Generate dropouts from expression data
########################################

In order to mimic real single-cell expression datasets, BoolODE
includes ``genDropouts.py``, which implements dropouts as described in
the paper by dropping expression values below ``DROP_CUTOFF`` with a
probability of ``DROP_PROB``.
This script samples ``NCELLS`` cells from the columns of the expression
dataset ``EXPR`` and throws an error if ``NCELLS`` is greater than the
number of columns.

``genDropouts.py`` takes the following command line arguments:

.. code:: text

    -h, --help            show this help message and exit
    --outPrefix=OUTPREFIX
                          Prefix for output files.
    -e EXPR, --expr=EXPR  Path to expression data file
    -p PSEUDO, --pseudo=PSEUDO
                          Path to pseudotime file
    -r REFNET, --refNet=REFNET
                          Path to reference network file
    -n NCELLS, --nCells=NCELLS
                          Number of cells to sample.
    -d, --dropout         Carry out dropout analysis? [Optional]
    --drop-cutoff=DROP_CUTOFF
                          Specify percentile cutoff on gene expression
    --drop-prob=DROP_PROB
                          Specify the probability of dropping a gene below
                          quantile q. Ensure 0 < DROP_PROB < 1.
    -i SAMPLENUM, --samplenum=SAMPLENUM
                          Sample Number

.. attention:: Ensure the ``--dropout`` option is passed. If it is not,
               ``genDropouts.py`` will still randomly sample cells but
               will not drop out any values.
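
The sampling and dropout steps described in this section can be
illustrated with a minimal standalone sketch. This is not the code in
``genDropouts.py``; the function names (``sample_cells``,
``percentile``, ``apply_dropouts``) are hypothetical. The sketch
assumes that the cutoff is a percentile between 0 and 100 computed
separately for each gene, that "below" means strictly less than the
cutoff value, and that a dropped value is replaced by zero. None of
these details is confirmed by the text above.

.. code:: python

    import math
    import random


    def sample_cells(expr, n_cells, rng):
        """Sample n_cells columns (cells) without replacement.

        expr maps each gene name to a list of values, one per cell.
        """
        n_total = len(next(iter(expr.values())))
        if n_cells > n_total:
            raise ValueError("n_cells exceeds the number of columns")
        cols = sorted(rng.sample(range(n_total), n_cells))
        return {gene: [vals[c] for c in cols] for gene, vals in expr.items()}


    def percentile(values, q):
        """Nearest-rank percentile of values, with q between 0 and 100."""
        ordered = sorted(values)
        rank = max(1, math.ceil(q / 100 * len(ordered)))
        return ordered[rank - 1]


    def apply_dropouts(expr, drop_cutoff, drop_prob, rng):
        """Zero out values below the per-gene cutoff with probability drop_prob.

        Assumption: the cutoff is computed per gene, and dropped values
        become 0.0.
        """
        dropped = {}
        for gene, vals in expr.items():
            threshold = percentile(vals, drop_cutoff)
            dropped[gene] = [
                0.0 if v < threshold and rng.random() < drop_prob else v
                for v in vals
            ]
        return dropped


    if __name__ == "__main__":
        rng = random.Random(0)  # fixed seed so the sketch is reproducible
        expr = {"g1": [1.0, 2.0, 3.0, 4.0], "g2": [4.0, 3.0, 2.0, 1.0]}
        subset = sample_cells(expr, 3, rng)
        print(apply_dropouts(subset, drop_cutoff=50, drop_prob=0.5, rng=rng))

Calling ``sample_cells`` alone corresponds to running without
``--dropout``: cells are still sampled, but no values are dropped.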