Dataset
Component description
A set of chemical compounds and assigned features
REST operations
Dataset
| Description | Method | URI | Parameters | Result | Status codes |
| --- | --- | --- | --- | --- | --- |
| get list of datasets available | GET | /dataset | Query (optional - to be defined) | List of URI (datasets.xsd) | 200,404,503 |
| create a new dataset | POST | /dataset/ | None or Representation in a supported MIME format | New URI /dataset/{id} | 200,400,503 |
| get dataset | GET | /dataset/{id} | Preferred MIME type | Representation in one of supported MIME formats | 200,404,503 |
| update dataset | PUT | /dataset/{id} | Representation in a supported MIME format | - | 200,400,404,503 |
| remove dataset | DELETE | /dataset/{id} | - | - | 200,404,503 |
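As an illustration of the dataset operations above, the following Python sketch builds (without sending) the HTTP request for each operation. The base URL `http://localhost:8080`, the helper name `dataset_request`, and the choice of `chemical/x-mdl-sdfile` as the MIME type are assumptions made for this example; the specification does not fix any of them.

```python
from typing import Optional
from urllib.request import Request

BASE_URL = "http://localhost:8080"  # hypothetical service root
SDF_MIME = "chemical/x-mdl-sdfile"  # assumed example of a supported MIME type


def dataset_request(method: str, dataset_id: Optional[str] = None,
                    body: Optional[bytes] = None,
                    mime: str = SDF_MIME) -> Request:
    """Build, but do not send, a request for one of the dataset operations."""
    if method in ("PUT", "DELETE") and dataset_id is None:
        raise ValueError(f"{method} needs a dataset id")
    if method == "POST":
        url = f"{BASE_URL}/dataset/"            # create: POST to the collection
    elif dataset_id is None:
        url = f"{BASE_URL}/dataset"             # list: GET the collection
    else:
        url = f"{BASE_URL}/dataset/{dataset_id}"
    headers = {}
    if method == "GET" and dataset_id is not None:
        headers["Accept"] = mime                # preferred MIME type
    if body is not None:
        headers["Content-Type"] = mime          # format of the representation sent
    return Request(url, data=body, headers=headers, method=method)


update = dataset_request("PUT", "42", body=b"...")
print(update.get_method(), update.full_url)
```

Passing one of these objects to `urllib.request.urlopen` would actually send it; the sketch stops short of that so it can be read without a running service.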
Chemical compounds in a dataset
| Description | Method | URI | Parameters | Result | Status codes |
| --- | --- | --- | --- | --- | --- |
| get compounds | HEAD | /dataset/{id}/compound | - | List of URI to structures as in dataset.xsd | 200,404,503 |
| get compounds | GET | /dataset/{id}/compound | Preferred MIME type | Representation in one of supported MIME formats | 200,404,503 |
| get compound | GET | /dataset/{id}/compound/{id2} | Preferred MIME type | Representation in one of supported MIME formats | 200,404,503 |
| add compound | POST | /dataset/{id}/compound/ | Representation in a supported MIME format | New URI /dataset/{id}/compound/{id2} | 200,400,404,503 |
| update compound | PUT | /dataset/{id}/compound/{id2} | Representation in a supported MIME format | - | 200,400,404,503 |
| remove a compound from a dataset | DELETE | /dataset/{id}/compound/{id2} | - | - | 200,404,503 |
| remove all compounds from a dataset | DELETE | /dataset/{id}/compound | - | - | 200,404,503 |
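The GET operations above take a "Preferred MIME type" and return a representation in one of the supported formats. One way a service might pick that format from an `Accept` header is sketched below. The list of supported types and the function name `choose_mime` are assumptions for illustration only, and the sketch handles exact types and `*/*` but not partial wildcards such as `chemical/*`.

```python
from typing import List, Optional

# Assumed set of representations; the specification leaves this list open.
SUPPORTED = ["chemical/x-mdl-sdfile", "chemical/x-cml", "chemical/x-daylight-smiles"]


def choose_mime(accept: str, supported: List[str] = SUPPORTED) -> Optional[str]:
    """Return the supported type with the highest q-value, or None if none match."""
    best, best_q = None, 0.0
    for item in accept.split(","):
        parts = [p.strip() for p in item.split(";")]
        mime, q = parts[0], 1.0                 # q defaults to 1.0 when absent
        for param in parts[1:]:
            if param.startswith("q="):
                try:
                    q = float(param[2:])
                except ValueError:
                    q = 0.0                     # malformed q-value: never preferred
        for candidate in supported:
            # "*/*" matches the first supported type; otherwise require equality
            if mime in ("*/*", candidate) and q > best_q:
                best, best_q = candidate, q
                break
    return best


print(choose_mime("chemical/x-cml;q=0.5, chemical/x-mdl-sdfile"))
```

A result of `None` corresponds to the "Incorrect MIME type" case, which this specification maps to status code 400.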
Conformers in a dataset (optional)
| Description | Method | URI | Parameters | Result | Status codes |
| --- | --- | --- | --- | --- | --- |
| get conformers | HEAD | /dataset/{id}/compound/{id2}/conformers | - | List of URI to conformers as in dataset.xsd | 200,404,503 |
| get conformers | GET | /dataset/{id}/compound/{id2}/conformers | Preferred MIME type | Representation in one of supported MIME formats | 200,404,503 |
| get conformer | GET | /dataset/{id}/compound/{id2}/conformer/{id3} | Preferred MIME type | Representation in one of supported MIME formats | 200,404,503 |
| add conformer | POST | /dataset/{id}/compound/{id2} | Representation in a supported MIME format | New URI /dataset/{id}/compound/{id2}/conformer/{id3} | 200,400,404,503 |
| update conformer | PUT | /dataset/{id}/compound/{id2}/conformer/{id3} | Representation in a supported MIME format | - | 200,400,404,503 |
| remove conformers | DELETE | /dataset/{id}/compound/{id2}/conformer/{ids} | - | - | 200,404,503 |
| remove all conformers | DELETE | /dataset/{id}/compound/{id2}/conformer | - | - | 200,404,503 |
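The URIs above nest three levels: dataset, compound, conformer. The following sketch shows one way a service could split such a path into the identifiers of each level. The function name `parse_path` and the shape of the returned dictionary are hypothetical, and the sketch assumes the singular segment names (`compound`, `conformer`) and identifiers without "/"; rows that spell the segment `conformers` are not covered.

```python
from typing import Dict, List, Optional

LEVELS: List[str] = ["dataset", "compound", "conformer"]


def parse_path(path: str) -> Dict[str, Optional[str]]:
    """Split a URI path into the ids of each nesting level.

    The "resource" key names the deepest level addressed; an id of None
    at that level means the whole collection is addressed.
    """
    segments = [s for s in path.split("/") if s]
    if len(segments) > 2 * len(LEVELS):
        raise ValueError(f"path too deep: {path!r}")
    result: Dict[str, Optional[str]] = {level: None for level in LEVELS}
    result["resource"] = None
    for depth, level in enumerate(LEVELS):
        name_pos, id_pos = 2 * depth, 2 * depth + 1
        if name_pos >= len(segments):
            break                               # path ends above this level
        if segments[name_pos] != level:
            raise ValueError(f"expected '{level}' in {path!r}")
        result["resource"] = level
        if id_pos < len(segments):
            result[level] = segments[id_pos]
    return result


print(parse_path("/dataset/7/compound/3/conformer"))
```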
Features in a dataset
| Description | Method | URI | Parameters | Result | Status codes |
| --- | --- | --- | --- | --- | --- |
| get feature definitions | HEAD | /dataset/{id}/feature_definition | - | List of URI of features as in datasets.xsd | 200,404,503 |
| get feature definitions | GET | /dataset/{id}/feature_definition | - | XML schema for Feature Definition object | 200,404,503 |
| get feature definition | GET | /dataset/{id}/feature_definition/{id2} | - | XML schema for Feature Definition object | 200,404,503 |
| add feature definition | PUT | /dataset/{id}/feature_definition/ | XML schema for Feature Definition object | New URI /dataset/{id}/feature_definition/{id2} | 200,400,404,503 |
| update feature definition | PUT | /dataset/{id}/feature_definition/{id2} | XML schema for Feature Definition object | - | 200,400,404,503 |
| remove feature_definition | DELETE | /dataset/{id}/feature_definition | - | - | 200,404,503 |
Actions on datasets (split, merge, subset)
| Description | Method | URI | Parameters | Result | Status codes |
| --- | --- | --- | --- | --- | --- |
| split | PUT | ? /split/dataset/{id}/ | split parameters (e.g. crossvalidation folds) | List of new dataset URI as in datasets.xsd | 200,404,503 |
| merge | PUT | ? /merge/dataset | List of dataset URI to be merged as in datasets.xsd | Merged dataset URI /dataset/{id} | 200,404,503 |
Alternative: split and merge can be considered special cases of "create dataset", with specific input parameters.
| Description | Method | URI | Parameters | Result | Status codes |
| --- | --- | --- | --- | --- | --- |
| create a new empty dataset | PUT | /dataset/ | None | New URI /dataset/{id} | 200,400,503 |
| split an existing dataset | PUT | /dataset/ | URI of dataset to split & parameters | New URI /dataset/{id} | 200,400,503 |
| merge datasets (union) | PUT | /dataset/ | List of dataset URI to be merged as in datasets.xsd | New URI /dataset/{id} | 200,400,503 |
Queries
| Description | Method | URI | Parameters | Result | Status codes |
| --- | --- | --- | --- | --- | --- |
| given a compound, retrieve congeneric chemicals | GET | TODO | - | new URI /dataset/{id} | 200,404,503 |
| given a compound, retrieve similar chemicals | GET | TODO | - | new URI /dataset/{id} | 200,404,503 |
| retrieve chemicals that have data for a given endpoint | GET | TODO | - | new URI /dataset/{id} | 200,404,503 |
| search within a dataset | GET | /dataset/{datasetid}/query | Parameters TODO | new URI /dataset/{id} | 200,404,503 |
| more - TODO | | | | | |
HTTP status codes
| Interpretation | Nr | Name |
| --- | --- | --- |
| Success | 200 | OK |
| Dataset not found | 404 | Not Found |
| Incorrect MIME type | 400 | Bad Request |
| Service not available | 503 | Service Unavailable |
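On the client side, the status codes above could be turned into errors along the following lines. The exception class `DatasetServiceError` and the function `check_status` are hypothetical names used only in this sketch; the interpretations are the ones listed in this section, and any other code is reported as not covered.

```python
from http import HTTPStatus

# Interpretation of each status code as listed in this section.
INTERPRETATION = {
    HTTPStatus.OK: "Success",
    HTTPStatus.NOT_FOUND: "Dataset not found",
    HTTPStatus.BAD_REQUEST: "Incorrect MIME type",
    HTTPStatus.SERVICE_UNAVAILABLE: "Service not available",
}


class DatasetServiceError(Exception):
    """Hypothetical exception type for non-200 responses."""

    def __init__(self, status: int, message: str):
        super().__init__(f"{status} {message}")
        self.status = status


def check_status(status: int) -> None:
    """Return silently on 200; raise DatasetServiceError otherwise."""
    if status == HTTPStatus.OK:
        return
    try:
        message = INTERPRETATION[HTTPStatus(status)]
    except (ValueError, KeyError):
        # unknown number, or a valid HTTP code this specification does not list
        message = "status code not covered by this specification"
    raise DatasetServiceError(status, message)


check_status(200)  # no exception
```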
Dataset representation
XML schema for Dataset object
Efficient creation of datasets for validation purposes
* remove the split, merge and subset dataset-options
* add the following commands:
desc: copy a dataset while excluding the listed compounds of the original dataset
method: POST
uri: /dataset/{id}/copy
params: exclude_compounds (comma-separated list of compound-ids)
return: uri of new dataset
desc: copy a dataset while including only the listed compounds of the original dataset
method: POST
uri: /dataset/{id}/copy
params: include_compounds (comma-separated list of compound-ids)
return: uri of new dataset
The old split and merge functions have the disadvantage that every dataset service has to provide them with exactly the same functionality.
The new copy functions allow efficient creation of test and training datasets (you do not have to add or remove each compound individually), and ensure that dataset splitting has to be implemented only once (by the validation component) and is easy to reproduce.
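To make the proposal concrete, the sketch below shows how a validation component might derive crossvalidation folds using only the two copy commands. It builds the requests without sending them. The base URL, the function names, the form-encoded request body, and the simple every-k-th split are all assumptions for illustration; the proposal itself only fixes the URI and the two parameter names.

```python
from typing import List, Tuple
from urllib.parse import urlencode
from urllib.request import Request

BASE_URL = "http://localhost:8080"  # hypothetical dataset service root


def copy_request(dataset_id: str, param: str, compound_ids: List[str]) -> Request:
    """Build POST /dataset/{id}/copy with a comma-separated compound-id list."""
    if param not in ("include_compounds", "exclude_compounds"):
        raise ValueError(f"unknown parameter: {param}")
    # Assumption: the parameter travels as a form-encoded request body.
    body = urlencode({param: ",".join(compound_ids)}).encode("ascii")
    return Request(f"{BASE_URL}/dataset/{dataset_id}/copy", data=body, method="POST")


def crossvalidation_requests(dataset_id: str, compound_ids: List[str],
                             folds: int) -> List[Tuple[Request, Request]]:
    """For each fold, return the pair (training request, test request)."""
    pairs = []
    for k in range(folds):
        test_ids = compound_ids[k::folds]       # every folds-th compound, offset k
        training = copy_request(dataset_id, "exclude_compounds", test_ids)
        test = copy_request(dataset_id, "include_compounds", test_ids)
        pairs.append((training, test))
    return pairs


for training, test in crossvalidation_requests("5", ["1", "2", "3", "4", "5", "6"], 3):
    print(training.data, test.data)
```

Because the split is computed by the caller and sent as explicit compound-id lists, the same folds can be reproduced against any dataset service that implements the copy command.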