Score

Section       Score
dataset       4 / 4 (100%)
optimization  8 / 8 (100%)
model         4 / 4 (100%)
evaluation    3 / 5 (60%)
total         19 / 21 (90%)
The DOME score is computed as the number of valid fields divided by the total number of fields.
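Worked example from the table above: 19 of the 21 fields are valid, so the score is 19 / 21 ≈ 0.905, shown as 90%.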
DOME-id: dfyn1yvtz3

Publication

Title of the article
Authors who contributed to the article
Name of the journal where the paper has been published
Year of publication, as a number
DOI of the published article
Tags related to the published article
Created Oct 15, 2024
Last update Oct 15, 2024

Dataset

  • Source of the dataset?
  • Number of data points?
  • Data used in previous paper and/or by community?
  • How many data splits?
  • How many data points in each split?
  • If the number of data splits is greater than two (2), what were the other splits? (e.g. cross-validation, validation set, independent test)
  • What is the distribution of data points in each data split (e.g. number of + and - cases in classification, or frequency distribution in regression)?
  • How were the datasets split? (See the sketch after this list.)
  • Are the training and test sets independent?
  • How was this enforced (e.g. redundancy reduction to less than x% pairwise identity)?
  • How does the distribution compare to previously published ML datasets in the biological field?
  • Are the data, including the data splits used, released in a public forum?
  • If yes, where (for example, supporting material, URL) and how (license)?
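The split and independence questions above can be answered in many ways; the following is a minimal, hypothetical sketch (the data points, labels, split fraction and identifiers are invented for illustration and are not this entry's actual protocol):

```python
import random
from collections import defaultdict

# Hypothetical labelled data points: (identifier, label) pairs.
data = [(f"seq_{i}", "positive" if i % 3 == 0 else "negative") for i in range(300)]

def stratified_split(points, test_fraction=0.2, seed=42):
    """Split points into train/test sets while preserving the label distribution."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for identifier, label in points:
        by_label[label].append((identifier, label))
    train, test = [], []
    for group in by_label.values():
        rng.shuffle(group)
        cut = int(len(group) * test_fraction)
        test.extend(group[:cut])
        train.extend(group[cut:])
    return train, test

train_set, test_set = stratified_split(data)

# Independence check: no identifier may occur in both splits. A real protocol
# would additionally reduce redundancy, e.g. by removing pairs of sequences
# above a given pairwise-identity threshold, which requires an alignment tool.
assert not {i for i, _ in train_set} & {i for i, _ in test_set}
print(f"{len(train_set)} training points, {len(test_set)} test points")
```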

Optimization

  • What is the machine-learning algorithm class used?
  • Is the machine-learning algorithm new?
  • If it is a new ML algorithm, why was it not published in a machine-learning journal, and why was it chosen over better known alternatives?
  • Does the model use data from other machine-learning algorithms as input (i.e. it is a meta-predictor)?
  • If it is a meta-predictor, which machine-learning methods constitute the whole?
  • If it is a meta-predictor, is it completely clear that training data of initial predictors and meta-predictor is independent of test data for the meta-predictor?
  • How was the data encoded and pre-processed for the machine-learning algorithm?
  • How many parameters (p) are used in the model?
  • How was p selected?
  • How many features (f) are used as input?
  • Was feature selection performed?
  • If feature selection was performed, was it done using the training set only? (See the sketch after this list.)
  • Is the number of parameters (p) much larger than the number of training points and/or is the number of features (f) large (e.g. in classification is p>>(Npos+Nneg) and/or f>100)?
  • If yes to previous question, how was over-fitting ruled out?
  • Conversely, if the number of training points seems very much larger than p and/or f is small, how was under-fitting ruled out?
  • Were any over-fitting prevention techniques performed (e.g. early stopping using a validation set)?
  • If yes, which ones?
  • Are the hyper-parameter configurations, optimization schedule, model files and optimization parameters reported and available?
  • If yes, where (e.g. URL) and how (license)?
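As a concrete illustration of selecting features on the training set only (referenced above), here is a minimal sketch with invented data; the correlation-based ranking and the cut-off of five features are arbitrary choices for the example, not this entry's method:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical feature matrix (200 points x 50 features) and binary labels.
X = rng.normal(size=(200, 50))
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=200) > 0).astype(int)

# Split first; feature selection must see the training portion only.
X_train, X_test = X[:150], X[150:]
y_train, y_test = y[:150], y[150:]

# Rank features by absolute correlation with the *training* labels.
correlations = np.array([
    abs(np.corrcoef(X_train[:, j], y_train)[0, 1]) for j in range(X_train.shape[1])
])
selected = np.argsort(correlations)[::-1][:5]  # keep the 5 highest-ranked features

# The same feature indices are then applied, unchanged, to the test set.
X_train_sel, X_test_sel = X_train[:, selected], X_test[:, selected]
print("selected feature indices:", selected)
```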

Model

  • Is the model black box or transparent?
  • If the model is transparent, can you give clear examples for this?
  • Is the model classification or regression?
  • How much time did it take for the model to run? (See the sketch after this list.)
  • Is the source code released?
  • Is a method to run the algorithm such as executable, web server, virtual machine or container instance released?
  • If yes to public release, where (e.g. URL) and how (license)?
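One of the questions above asks how long the model takes to run; below is a minimal wall-clock timing sketch, with a trivial stand-in in place of the real (hypothetical) prediction routine:

```python
import time

# Trivial stand-in for a trained model's prediction routine (hypothetical).
def predict(batch):
    return [sum(features) > 0 for features in batch]

batch = [[0.2, -0.1, 0.4]] * 10_000

start = time.perf_counter()
predictions = predict(batch)
elapsed = time.perf_counter() - start
print(f"predicted {len(predictions)} points in {elapsed:.4f} s")
```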

Evaluation

  • How was the method evaluated? (e.g. cross-validation, independent dataset, novel experiments)
  • Which performance metrics are reported?
  • Is this set of metrics representative (e.g. compared to the literature)?
  • Was a comparison to publicly available methods performed on benchmark datasets?
  • Was a comparison to simpler baselines performed?
  • Do the performance metrics have confidence intervals? (See the sketch after this list.)
  • Are the results statistically significant to claim that the method is superior to others and baselines?
  • Are the raw evaluation files (e.g. assignments for comparison and baselines, statistical code, confusion matrices) available?
  • If publicly released, where (e.g. URL) and how (license)?
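For the confidence-interval question above, a minimal percentile-bootstrap sketch is given below; the labels, predictions and accuracy metric are invented for illustration and are not this entry's reported results:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical test-set labels and predictions (~80% of predictions correct).
y_true = rng.integers(0, 2, size=200)
y_pred = np.where(rng.random(200) < 0.8, y_true, 1 - y_true)

def accuracy(true, pred):
    return float(np.mean(true == pred))

def bootstrap_ci(true, pred, metric, n_resamples=2000, alpha=0.05):
    """Percentile bootstrap confidence interval for a test-set metric."""
    n = len(true)
    scores = [
        metric(true[idx], pred[idx])
        for idx in (rng.integers(0, n, size=n) for _ in range(n_resamples))
    ]
    lower, upper = np.quantile(scores, [alpha / 2, 1 - alpha / 2])
    return lower, upper

low, high = bootstrap_ci(y_true, y_pred, accuracy)
print(f"accuracy = {accuracy(y_true, y_pred):.3f}, 95% CI [{low:.3f}, {high:.3f}]")
```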