BLEval package

The BEELINE Evaluation (BLEval) module contains the BLEval.BLEval class and three additional classes used in its definition.

class BLEval.BLEval(input_settings: BLEval.InputSettings, output_settings: BLEval.OutputSettings)[source]

Bases: object

The BEELINE Evaluation object is created by parsing a user-provided configuration file. Its methods process the inputs into a series of jobs to be run, and then run these jobs.

computeAUC(directed=True)[source]

Computes the areas under the precision-recall (PR) and ROC curves for each algorithm-dataset combination.

Parameters: directed (bool) – A flag to specify whether to treat predictions as directed edges (directed = True) or undirected edges (directed = False).
Returns:
  • AUPRC: A dataframe containing AUPRC values for each algorithm-dataset combination
  • AUROC: A dataframe containing AUROC values for each algorithm-dataset combination
computeEarlyPrec()[source]

For each algorithm-dataset combination, this function computes the Early Precision values of the network formed using the predicted top-k edges.

Returns: A dataframe containing the early precision values for each algorithm-dataset combination.
computeJaccard()[source]

Computes Jaccard Index between top-k edge predictions of the same algorithm.

Returns: A dataframe containing the median and median absolute deviation of the Jaccard Index values of each algorithm on the given set of datasets.
computeNetMotifs()[source]

For each algorithm-dataset combination, this function counts network motifs such as feedforward loops, feedback loops, and mutual interactions in the predicted top-k network. It returns the ratio of each motif count to its respective value in the reference network.

Returns:
  • FBL: A dataframe containing ratios of number of Feedback loops
  • FFL: A dataframe containing ratios of number of Feedforward loops
  • MI: A dataframe containing ratios of number of Mutual Interactions
computeSignedEPrec()[source]

For each algorithm-dataset combination, this function computes the Early Precision values separately for the activation and inhibitory edges.

Returns:
  • A dataframe containing early precision for activation edges
  • A dataframe containing early precision for inhibitory edges
computeSpearman()[source]

Finds the Spearman’s correlation coefficient between the ranked edges of the same algorithm on the given set of datasets.

Returns: A dataframe containing the median and median absolute deviation of the Spearman’s correlation values of each algorithm.
parseTime()[source]

Parse time output for each algorithm-dataset combination.

Returns: A dictionary of times for all dataset-algorithm combinations.
class BLEval.ConfigParser[source]

Bases: object

This class defines static methods for parsing and storing the contents of the config file, which sets a large number of parameters used in BLEval.

static parse(config_file_handle) → BLEval.BLEval[source]

A method for parsing the input .yaml file.

Parameters: config_file_handle (str) – Name of the .yaml file to be parsed
Returns: An object of class BLEval.BLEval.
class BLEval.InputSettings(datadir, datasets, algorithms)[source]

Bases: object

The class for storing the names of input files. It initializes an InputSettings object based on the following three parameters.

Parameters:
  • datadir (str) – input dataset root directory, typically ‘inputs/’
  • datasets (list) – List of dataset names
  • algorithms (list) – List of algorithm names
class BLEval.OutputSettings(base_dir, output_prefix: pathlib.Path)[source]

Bases: object

The class for storing the names of the directories that output should be written to. It initializes an OutputSettings object based on the following two parameters.

Parameters:
  • base_dir (str) – output root directory, typically ‘outputs/’
  • output_prefix – A prefix added to the final output files.

Submodules

BLEval.computeAUC module

BLEval.computeAUC.PRROC(dataDict, inputSettings, directed=True, selfEdges=False, plotFlag=False)[source]

Computes areas under the precision-recall and ROC curves for a given dataset for each algorithm.

Parameters:
  • directed (bool) – A flag to indicate whether to treat predictions as directed edges (directed = True) or undirected edges (directed = False).
  • selfEdges (bool) – A flag to indicate whether to include self-edges (selfEdges = True) or exclude self-edges (selfEdges = False) from evaluation.
  • plotFlag (bool) – A flag to indicate whether or not to save PR and ROC plots.
Returns:

  • AUPRC: A dictionary containing AUPRC values for each algorithm
  • AUROC: A dictionary containing AUROC values for each algorithm

BLEval.computeAUC.computeScores(trueEdgesDF, predEdgeDF, directed=True, selfEdges=True)[source]

Computes precision-recall and ROC curves using scikit-learn for a given set of predictions in the form of a DataFrame.

Parameters:
  • trueEdgesDF (DataFrame) – A pandas dataframe containing the true classes. The indices of this dataframe are all possible edges in a graph formed using the genes in the given dataset. This dataframe has only one column, which indicates the class label of an edge. If an edge is present in the reference network, it gets a class label of 1, else 0.
  • predEdgeDF (DataFrame) – A pandas dataframe containing the edge ranks from the predicted network. The indices of this dataframe are all possible edges. This dataframe has only one column, which indicates the edge weights in the predicted network. The higher the weight, the higher the edge confidence.
  • directed (bool) – A flag to indicate whether to treat predictions as directed edges (directed = True) or undirected edges (directed = False).
  • selfEdges (bool) – A flag to indicate whether to include self-edges (selfEdges = True) or exclude self-edges (selfEdges = False) from evaluation.
Returns:

  • prec: A list of precision values (for PR plot)
  • recall: A list of recall values (for PR plot)
  • fpr: A list of false positive rates (for ROC plot)
  • tpr: A list of true positive rates (for ROC plot)
  • AUPRC: Area under the precision-recall curve
  • AUROC: Area under the ROC curve
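The label and weight layout described above can be illustrated with a minimal sketch. It is not the BLEval implementation: the helper name score_predictions, the toy genes, and the choice to give unpredicted edges a weight of 0 are all assumptions made for illustration, and plain dictionaries stand in for the dataframes. Only the scikit-learn calls (precision_recall_curve, roc_curve, auc) are real library functions.

```python
from itertools import permutations, product

from sklearn.metrics import auc, precision_recall_curve, roc_curve


def score_predictions(genes, true_edges, pred_weights, self_edges=False):
    """Hypothetical helper: return (AUPRC, AUROC) for directed predictions."""
    # Enumerate every possible directed edge among the genes.
    if self_edges:
        possible = list(product(genes, repeat=2))
    else:
        possible = list(permutations(genes, 2))
    # Class label 1 if the edge is in the reference network, else 0.
    y_true = [1 if edge in true_edges else 0 for edge in possible]
    # Assumption: edges absent from the prediction get weight 0.
    y_score = [pred_weights.get(edge, 0.0) for edge in possible]
    prec, recall, _ = precision_recall_curve(y_true, y_score)
    fpr, tpr, _ = roc_curve(y_true, y_score)
    return auc(recall, prec), auc(fpr, tpr)


# Toy data (made up for this example).
genes = ["g1", "g2", "g3"]
true_edges = {("g1", "g2"), ("g2", "g3")}
pred_weights = {
    ("g1", "g2"): 0.9,
    ("g3", "g1"): 0.8,
    ("g2", "g3"): 0.7,
    ("g1", "g3"): 0.1,
}
auprc, auroc = score_predictions(genes, true_edges, pred_weights)
print(f"AUPRC={auprc:.3f} AUROC={auroc:.3f}")
```

In this toy case one false edge is ranked above one true edge, so the AUROC falls below 1. Treating predictions as undirected would require collapsing each pair of opposite edges first, which this sketch does not attempt.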

BLEval.computeDGAUC module

BLEval.computeDGAUC.PRROC(dataDict, inputSettings, directed=True, selfEdges=False, plotFlag=False)[source]

Computes areas under the precision-recall and ROC curves for a given dataset for each algorithm.

Parameters:
  • directed (bool) – A flag to indicate whether to treat predictions as directed edges (directed = True) or undirected edges (directed = False).
  • selfEdges (bool) – A flag to indicate whether to include self-edges (selfEdges = True) or exclude self-edges (selfEdges = False) from evaluation.
  • plotFlag (bool) – A flag to indicate whether or not to save PR and ROC plots.
Returns:

  • AUPRC: A dictionary containing AUPRC values for each algorithm
  • AUROC: A dictionary containing AUROC values for each algorithm

BLEval.computeDGAUC.computeScores(trueEdgesDF, predEdgeDF, directed=True, selfEdges=True)[source]

Computes precision-recall and ROC curves using scikit-learn for a given set of predictions in the form of a DataFrame.

Parameters:
  • trueEdgesDF (DataFrame) – A pandas dataframe containing the true classes. The indices of this dataframe are all possible edges in a graph formed using the genes in the given dataset. This dataframe has only one column, which indicates the class label of an edge. If an edge is present in the reference network, it gets a class label of 1, else 0.
  • predEdgeDF (DataFrame) – A pandas dataframe containing the edge ranks from the predicted network. The indices of this dataframe are all possible edges. This dataframe has only one column, which indicates the edge weights in the predicted network. The higher the weight, the higher the edge confidence.
  • directed (bool) – A flag to indicate whether to treat predictions as directed edges (directed = True) or undirected edges (directed = False).
  • selfEdges (bool) – A flag to indicate whether to include self-edges (selfEdges = True) or exclude self-edges (selfEdges = False) from evaluation.
Returns:

  • prec: A list of precision values (for PR plot)
  • recall: A list of recall values (for PR plot)
  • fpr: A list of false positive rates (for ROC plot)
  • tpr: A list of true positive rates (for ROC plot)
  • AUPRC: Area under the precision-recall curve
  • AUROC: Area under the ROC curve

BLEval.computeEarlyPrec module

BLEval.computeEarlyPrec.EarlyPrec(evalObject, algorithmName, TFEdges=False)[source]

Computes early precision for a given algorithm for each dataset. We define early precision as the fraction of true positives in the top-k edges, where k is the number of edges in the ground truth network (excluding self loops).

Parameters:
  • evalObject (BLEval) – An object of class BLEval.BLEval.
  • algorithmName (str) – Name of the algorithm for which the early precision is computed.
Returns:

A dataframe containing early precision values for a given algorithm for each dataset.
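The definition of early precision given above can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the BLEval code: the function name early_precision and the toy edge lists are hypothetical, ties in the ranking are ignored, and the predicted edges are assumed to be already sorted from most to least confident.

```python
def early_precision(true_edges, ranked_pred_edges):
    """Hypothetical helper: fraction of true positives among the top-k edges."""
    # k is the number of reference edges, excluding self loops.
    true_set = {(src, dst) for src, dst in true_edges if src != dst}
    k = len(true_set)
    if k == 0:
        return 0.0
    # Keep the k highest-ranked predicted edges, also ignoring self loops.
    top_k = [edge for edge in ranked_pred_edges if edge[0] != edge[1]][:k]
    hits = sum(1 for edge in top_k if edge in true_set)
    return hits / k


# Toy data (made up): three reference edges plus one self loop.
reference = [("A", "B"), ("B", "C"), ("C", "A"), ("A", "A")]
ranked = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "A")]
ep = early_precision(reference, ranked)
print(round(ep, 3))
```

Here k is 3, and two of the top three predicted edges are in the reference network, so the early precision is 2/3.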

BLEval.computeJaccard module

BLEval.computeJaccard.Jaccard(evalObject, algorithmName)[source]

A function to compute the median pairwise Jaccard similarity index of predicted top-k edges for a given set of datasets (obtained from the same reference network). Here k is the number of edges in the reference network (excluding self loops).

Parameters:
  • evalObject (BLEval) – An object of class BLEval.BLEval.
  • algorithmName (str) – Name of the algorithm for which the Jaccard index is computed.
Returns:

  • median: Median of the Jaccard index values
  • mad: Median Absolute Deviation of the Jaccard index values

BLEval.computeJaccard.computePairwiseJacc(inDict)[source]

A helper function to compute all pairwise Jaccard similarity indices of predicted top-k edges for a given set of datasets (obtained from the same reference network). Here k is the number of edges in the reference network (excluding self loops).

Parameters: inDict (dict) – A dictionary containing the top-k predicted edges for each dataset. Here, the keys are the dataset names and the values are the sets of top-k edges.
Returns: A dataframe containing pairwise Jaccard similarity index values
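The pairwise computation and the median/MAD summary described in this module can be sketched as follows. This is a minimal illustration rather than the BLEval implementation: the function names pairwise_jaccard and median_and_mad are hypothetical, a dictionary replaces the returned dataframe, and the MAD is the plain unscaled median of absolute deviations, which may differ from the variant used by the package.

```python
from itertools import combinations
from statistics import median


def pairwise_jaccard(top_k_by_dataset):
    """Hypothetical helper: Jaccard index for every pair of datasets."""
    scores = {}
    for (name1, set1), (name2, set2) in combinations(top_k_by_dataset.items(), 2):
        union = set1 | set2
        # Jaccard index = |intersection| / |union| (0 if both sets are empty).
        scores[(name1, name2)] = len(set1 & set2) / len(union) if union else 0.0
    return scores


def median_and_mad(values):
    """Median and unscaled median absolute deviation of a list of numbers."""
    values = list(values)
    med = median(values)
    mad = median(abs(v - med) for v in values)
    return med, mad


# Toy input (made up): dataset name -> set of top-k predicted edges.
in_dict = {
    "d1": {("A", "B"), ("B", "C"), ("C", "A")},
    "d2": {("A", "B"), ("B", "C"), ("A", "C")},
    "d3": {("A", "B"), ("C", "B"), ("A", "C")},
}
jaccard = pairwise_jaccard(in_dict)
med, mad = median_and_mad(jaccard.values())
print(jaccard, med, mad)
```

The three pairwise values are 0.5, 0.2, and 0.5, so the median is 0.5 and the unscaled MAD is 0.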

BLEval.computeNetMotifs module

BLEval.computeNetMotifs.Motifs(datasetDict, inputSettings)[source]

Computes ratios of the counts of various network motifs for each algorithm for a given dataset. The ratios are computed by dividing the counts of the network motifs in the predicted top-k network by their respective values in the reference network.

Parameters:
Returns:

  • FBL: A dataframe containing ratios of three-node feedback loop motifs
  • FFL: A dataframe containing ratios of three-node feedforward loop motifs
  • MI: A dataframe containing ratios of two-node mutual interaction motifs

BLEval.computeNetMotifs.getNetProp(inGraph)[source]

A helper function to compute counts of various network motifs.

Parameters: inGraph (networkx.DiGraph) – A graph object of class networkx.DiGraph.
Returns:
  • A value corresponding to the number of three-node feedback loops
  • A value corresponding to the number of three-node feedforward loops
  • A value corresponding to the number of two-node mutual interactions
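The three motif types and the ratio against the reference network can be sketched with a brute-force count. This is an illustration under stated assumptions, not the BLEval code: the function name count_motifs and the toy graphs are hypothetical, and the exact counting convention used by the package (for example, how overlapping motifs are handled) may differ.

```python
from itertools import permutations

import networkx as nx


def count_motifs(graph):
    """Hypothetical helper: return (feedback loops, feedforward loops, mutual interactions)."""
    # Mutual interaction: both u -> v and v -> u exist; each pair is seen twice.
    mutual = sum(1 for u, v in graph.edges if u != v and graph.has_edge(v, u)) // 2
    fbl = 0
    ffl = 0
    for a, b, c in permutations(graph.nodes, 3):
        if graph.has_edge(a, b) and graph.has_edge(b, c):
            if graph.has_edge(a, c):
                ffl += 1  # feedforward loop: a -> b -> c plus a -> c
            if graph.has_edge(c, a):
                fbl += 1  # feedback loop: a -> b -> c -> a
    # Each three-node cycle was counted once per rotation.
    return fbl // 3, ffl, mutual


# Toy networks (made up for this example).
reference = nx.DiGraph([("A", "B"), ("B", "C"), ("A", "C"),
                        ("C", "D"), ("D", "E"), ("E", "C"),
                        ("F", "G"), ("G", "F")])
predicted = nx.DiGraph([("A", "B"), ("B", "C"), ("A", "C"),
                        ("A", "D"), ("D", "C"),
                        ("F", "G"), ("G", "F")])
ref_counts = count_motifs(reference)
pred_counts = count_motifs(predicted)
# Ratio of predicted to reference counts; NaN when the reference count is 0.
ratios = tuple(p / r if r else float("nan") for p, r in zip(pred_counts, ref_counts))
print(ref_counts, pred_counts, ratios)
```

The triple loop is cubic in the number of nodes, which is acceptable for a small example but would be slow on large networks.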

BLEval.computeSignedEPrec module

BLEval.computeSignedEPrec.signedEPrec(evalObject, algorithmName)[source]

Computes median signed early precision for a given algorithm across all datasets, i.e., the function computes early precision of activation edges and early precision for inhibitory edges in the reference network. We define early precision of activation edges as the fraction of true positives in the top-ka edges, where ka is the number of activation edges in the reference network (excluding self loops). We define early precision of inhibitory edges as the fraction of true positives in the top-ki edges, where ki is the number of inhibitory edges in the reference network (excluding self loops).

Parameters:
  • evalObject (BLEval) – An object of class BLEval.BLEval.
  • algorithmName (str) – Name of the algorithm for which the early precision is computed.
Returns:

A dataframe with early precision of activation edges (+) and inhibitory edges (-) for a given algorithm

BLEval.computeSpearman module

BLEval.computeSpearman.Spearman(evalObject, algorithmName)[source]

A function to compute the median pairwise Spearman correlation of predicted ranked edges, i.e., the outputs of different datasets generated from the same reference network, for a given algorithm.

Parameters:
  • evalObject (BLEval) – An object of class BLEval.BLEval.
  • algorithmName (str) – Name of the algorithm for which the Spearman correlation is computed.
Returns:

  • median: Median of Spearman correlation values
  • mad: Median Absolute Deviation of the Spearman correlation values
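The median and MAD of pairwise Spearman correlations can be sketched as below. This is a minimal example, not the BLEval implementation: the dataset names and weights are made up, every dataset is assumed to list weights for the same edges in the same order, and the MAD is the plain unscaled median of absolute deviations. The only library call is scipy.stats.spearmanr.

```python
from itertools import combinations
from statistics import median

from scipy.stats import spearmanr

# Toy input (made up): predicted weights for the same four edges per dataset.
edge_weights = {
    "dataset1": [0.9, 0.7, 0.4, 0.1],
    "dataset2": [0.8, 0.6, 0.5, 0.2],
    "dataset3": [0.9, 0.4, 0.7, 0.1],
}

# Spearman correlation for every pair of datasets.
correlations = []
for (_, weights1), (_, weights2) in combinations(edge_weights.items(), 2):
    rho, _pvalue = spearmanr(weights1, weights2)
    correlations.append(float(rho))

# Summarize with the median and the unscaled median absolute deviation.
med = median(correlations)
mad = median(abs(c - med) for c in correlations)
print(correlations, med, mad)
```

The first two datasets rank the edges identically, while the third swaps the two middle edges, so the pairwise correlations are 1.0, 0.8, and 0.8.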

BLEval.parseTime module

BLEval.parseTime.getTime(evalObject, dataset)[source]

Return time taken for each of the algorithms in the evalObject on the dataset specified. The output stored in time.txt is parsed to obtain the CPU time.

Parameters:
  • evalObject (BLEval) – An object of the class BLEval.BLEval
  • dataset (str) – Dataset name for which the time output must be parsed for each algorithm.
Returns:

A dictionary of the time taken by each of the algorithms, i.e., the key is the algorithm name and the value is the time taken (in seconds).

BLEval.parseTime.parse_time_files(path)[source]

Returns the time taken for a single algorithm-dataset run. The output stored in the time file at the given path is parsed to obtain the CPU time.

Parameters: path (str) – Path to the time.txt file, or timex.txt file where x corresponds to the trajectory ID for a given algorithm-dataset combination.
Returns: A float value corresponding to the time taken.