Asynchronous search tasks
Up to MadFast version 0.3.3, similarity search requests to the REST API were processed only synchronously: when the server received the request it launched the search, and the response was composed from the search results. For long-running searches, neither cancellation nor progress monitoring is available this way. The synchronous endpoints are still available; however, new, experimental asynchronous capabilities were introduced in 0.3.4.
Please note that asynchronous task handling is currently highly experimental, and incompatible changes are expected in future releases. If you plan to use these asynchronous endpoints, please contact us to discuss your use case and compatibility requirements.
As a reference see the typical synchronous call to the REST API:
Key concepts for asynchronous tasks
- Interactions are coordinated using synchronous POST, GET and DELETE HTTP requests. Other techniques, like web sockets, are not utilized currently.
- Endpoints are introduced to launch synchronous tasks with JSON request parameter objects. See /rest/descriptors/{desc}/find-most-similars (POST) as an example. For these endpoints a single request parameter object (MostSimilarsRequest) is expected as application/json. The response is also a single result object (MostSimilarsResult).
- Dedicated endpoints are introduced to launch async tasks, like /rest/descriptors/{desc}/find-most-similars-async.
- The async task launcher endpoints expect POST requests where the request body is the same JSON request object that the synchronous counterpart expects.
- The response of an async task launcher endpoint is the task metadata object (AsyncCallDto), containing the task ID, progress info and the result (when finished) or partial result (when implemented). Upon completion the status will contain, in field result, the same result structure as the synchronous counterpart. Implementations might report partial results as well.
- A server side abstract resource (/rest/experimental-async-calls) is introduced to represent running tasks. This resource is used to poll progress.
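The launch-then-poll interaction described in the list above can be sketched in Python using only the standard library. This is a minimal illustration, not an official client: the base URL is taken from the examples later in this document, and the helper names (build_launch_request, fetch_json, poll_until_done) as well as the max_polls safety limit are hypothetical. The field names task.done and id are those shown in the example AsyncCallDto output in this document.

```python
import json
import time
import urllib.request

# Assumption: host and port match the example server used in this document.
BASE_URL = "http://localhost:8085"


def build_launch_request(base_url, descriptor, request_obj):
    """Build the POST request that launches an async most-similar search."""
    url = f"{base_url}/rest/descriptors/{descriptor}/find-most-similars-async"
    return urllib.request.Request(
        url,
        data=json.dumps(request_obj).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def fetch_json(request_or_url):
    """Send a request (or GET a URL) and decode the JSON response body."""
    with urllib.request.urlopen(request_or_url) as response:
        return json.load(response)


def poll_until_done(call_id, base_url=BASE_URL, fetch=fetch_json,
                    interval_s=0.5, max_polls=1200, sleep=time.sleep):
    """Poll the async call resource until task.done is true.

    Returns the final AsyncCallDto as a dict. max_polls is a client-side
    safety limit (hypothetical, not a server setting).
    """
    status_url = f"{base_url}/rest/experimental-async-calls/{call_id}"
    for _ in range(max_polls):
        status = fetch(status_url)
        if status["task"]["done"]:
            return status
        sleep(interval_s)
    raise TimeoutError(f"async call {call_id} did not finish in {max_polls} polls")
```

Against a running server, the response of `fetch_json(build_launch_request(BASE_URL, "slow-fp", {"query": "C1CCCCC1", "maxCount": 1}))` carries the call ID in its id field, which can then be passed to `poll_until_done`. The fetch and sleep parameters are injectable only so the polling loop can be exercised without a server.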
Overview of an asynchronous call:
Additional details
- The async call representation objects are automatically deleted after a timeout. REST API clients are expected to poll the status periodically.
- A cancel request can be sent to an asynchronous task with /rest/experimental-async-calls/{id}.
- Async task launcher endpoints (like /rest/descriptors/{desc}/find-most-similars-async) accept a query parameter sync-result-timeout-ms. The REST server will wait up to the specified amount of time (in milliseconds) instead of returning immediately. If the underlying task completes within this period, the returned AsyncCallDto will contain the final calculation result and no separate task object is created.
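As a rough sketch of the last two points, the helpers below build a launcher URL with the sync-result-timeout-ms query parameter and a cancel request. Two things here are assumptions rather than documented behavior: the cancel request is sent as HTTP DELETE (the text above does not name the method; DELETE is inferred from the list of HTTP methods in use), and a non-null result field is taken as the sign that no further polling is needed. All function names are hypothetical.

```python
import urllib.parse
import urllib.request


def launch_url(base_url, descriptor, sync_result_timeout_ms=None):
    """URL of the async launcher, optionally with sync-result-timeout-ms."""
    url = f"{base_url}/rest/descriptors/{descriptor}/find-most-similars-async"
    if sync_result_timeout_ms is not None:
        query = urllib.parse.urlencode(
            {"sync-result-timeout-ms": int(sync_result_timeout_ms)}
        )
        url = f"{url}?{query}"
    return url


def build_cancel_request(base_url, call_id):
    """Build a cancel request for a running async call.

    ASSUMPTION: cancellation uses HTTP DELETE; the document does not state
    the method explicitly.
    """
    safe_id = urllib.parse.quote(call_id, safe="")
    return urllib.request.Request(
        f"{base_url}/rest/experimental-async-calls/{safe_id}",
        method="DELETE",
    )


def has_final_result(async_call_dto):
    """True when the launcher response already carries the final result.

    ASSUMPTION: a non-null 'result' field means the task finished within
    the sync-result-timeout-ms window and polling can be skipped.
    """
    return async_call_dto.get("result") is not None
```

A client might launch with a short timeout (for example 500 ms), use the response directly when `has_final_result` is true, and fall back to polling otherwise.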
Web UI usages
The similarity search results and dissimilarity distribution display components use the asynchronous endpoints rest/descriptors/{desc}/find-most-similars-async and rest/descriptors/{desc}/distribution-async.
The current Web UI displays a progress bar during longer searches, and it sends cancel requests when a new structure is entered.
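The cancel-on-new-input behavior can be illustrated with a small client-side pattern. The sketch below is not the Web UI's actual implementation; the class name and the launch and cancel callables are hypothetical stand-ins for whatever code sends the launch and cancel requests.

```python
class LatestSearchTracker:
    """Keep at most one async search alive.

    When a new query is submitted, the previously launched call (if any)
    is cancelled before the new one is started.
    """

    def __init__(self, launch, cancel):
        self._launch = launch    # callable(query) -> call ID of the new task
        self._cancel = cancel    # callable(call_id) -> None
        self._active_id = None   # ID of the call still considered running

    def submit(self, query):
        """Cancel the active call, launch a new one and return its ID."""
        if self._active_id is not None:
            self._cancel(self._active_id)
        self._active_id = self._launch(query)
        return self._active_id

    def finished(self, call_id):
        """Forget a call once its final result has been received."""
        if call_id == self._active_id:
            self._active_id = None
```

Passing the network operations in as callables keeps the bookkeeping independent of the HTTP client in use.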
REST API examples
The following REST API examples use the diagnostic tool com.chemaxon.overlap.wui.SlowUnguardedComparator (part of the distribution tool codebase). When injected as a comparator, this class introduces a specified delay into every similarity comparison. The examples use the drugbank dataset (of around 7k molecules).
Prepare running server:
# Import MMS
bin/createMms.sh -in data/molecules/drugbank/drugbank-common_name.smi.gz -out mms.bin

# Calculate fingerprint; inject slow comparator wrapper as the default comparison
# Note that the similarity search internally will further group pages together
# Specify pagesize 1 in order to avoid grouping all targets together into a single page
bin/buildStorage.sh \
    -in data/molecules/drugbank/drugbank-common_name.smi.gz \
    -out fp.bin \
    -context createSimpleCfp7Context \
    -contextjs 'ctx.pagesize(1).unguarded(
        ctx.getUnguardedExtractor(),
        new Packages.com.chemaxon.overlap.wui.SlowUnguardedComparator(
            ctx.getUnguardedDissimilarityCalculator(), 1
        )
    );'

# Launch embedded server
# Specify to use only a single working thread for search execution
bin/gui.sh \
    -tp 1 \
    -allowedOrigins "*,*" -nobrowse -port 8085 \
    -mols -mms:mms.bin:-name:m \
    -desc -desc:fp.bin:-name:slow-fp:-mols:m
Launch similarity search
echo "******************************************************"
echo "Launch request"
echo "******************************************************"
echo
# Note that we must send a JSON request object
# Option -sS makes curl hide its progress but show errors
curl \
-sS \
-X POST \
-H 'Content-Type: application/json' \
-d '{ "query":"C1CCCCC1", "maxCount":1 }' \
"http://localhost:8085/rest/descriptors/slow-fp/find-most-similars-async" | python -m json.tool
# Get status immediately, then after 2 seconds and after a further 10 seconds
echo "******************************************************"
echo "Task status after launched"
echo "******************************************************"
echo
# We "know" that the first assigned task ID (after server startup) will be "AR0000"
curl -sS "http://localhost:8085/rest/experimental-async-calls/AR0000" | python -m json.tool
sleep 2
echo "******************************************************"
echo "Task status while running"
echo "******************************************************"
echo
curl -sS "http://localhost:8085/rest/experimental-async-calls/AR0000" | python -m json.tool
sleep 10
echo "******************************************************"
echo "Task status after finished"
echo "******************************************************"
echo
curl -sS "http://localhost:8085/rest/experimental-async-calls/AR0000" | python -m json.tool
Example output:
******************************************************
Launch request
******************************************************
{
    "error": null,
    "id": "AR0000",
    "partialResult": null,
    "result": null,
    "task": {
        "cancelled": false,
        "done": false,
        "id": "T0004",
        "name": "async-AR0000",
        "runningDurationMs": 1,
        "startTimeMs": 1568302605472,
        "totalWork": null,
        "workUnit": null,
        "worked": 0
    }
}
******************************************************
Task status after launched
******************************************************
{
    "error": null,
    "id": "AR0000",
    "partialResult": null,
    "result": null,
    "task": {
        "cancelled": false,
        "done": false,
        "id": "T0004",
        "name": "async-AR0000",
        "runningDurationMs": 54,
        "startTimeMs": 1568302605472,
        "totalWork": 7123,
        "workUnit": null,
        "worked": 0
    }
}
******************************************************
Task status while running
******************************************************
{
    "error": null,
    "id": "AR0000",
    "partialResult": null,
    "result": null,
    "task": {
        "cancelled": false,
        "done": false,
        "id": "T0004",
        "name": "async-AR0000",
        "runningDurationMs": 2072,
        "startTimeMs": 1568302605472,
        "totalWork": 7123,
        "workUnit": null,
        "worked": 1000
    }
}
******************************************************
Task status after finished
******************************************************
{
    "error": null,
    "id": "AR0000",
    "partialResult": null,
    "result": {
        "query": "C1CCCCC1",
        "querysmi": "C1CCCCC1",
        "searchtime": 7637,
        "targetcount": 7123,
        "targets": [
            {
                "base64img": null,
                "dissimilarity": 0.3333333333333333,
                "targetid": "MOLECULE-3252",
                "targetimageurl": "rest/molecules/m/3252/png-or-placeholder?w=0&h=0",
                "targetindex": 3252,
                "targetmolurl": "rest/molecules/m/3252"
            }
        ]
    },
    "task": {
        "cancelled": false,
        "done": true,
        "id": "T0004",
        "name": "async-AR0000",
        "runningDurationMs": 7665,
        "startTimeMs": 1568302605472,
        "totalWork": 7123,
        "workUnit": null,
        "worked": 7123
    }
}
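The task fields in the status objects above are enough to drive a simple progress display. The following sketch assumes that worked and totalWork are counted in the same unit and that totalWork may be null shortly after launch, as in the first response shown above; the function names are hypothetical.

```python
def progress_fraction(async_call_dto):
    """Return progress in [0, 1], or None while totalWork is not yet known."""
    task = async_call_dto["task"]
    total = task.get("totalWork")
    if not total:
        # totalWork is null (or zero) right after launch in the example output
        return None
    return min(task["worked"] / total, 1.0)


def describe_progress(async_call_dto):
    """Return a short human-readable status line for an AsyncCallDto dict."""
    task = async_call_dto["task"]
    if task["done"]:
        return "done"
    fraction = progress_fraction(async_call_dto)
    if fraction is None:
        return "starting"
    return f"{fraction:.0%} ({task['worked']}/{task['totalWork']})"
```

Applied to the "while running" status above (worked 1000 of totalWork 7123), `describe_progress` yields a line reporting roughly 14% completion.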
Security considerations
Asynchronous call statuses are visible to every REST API client without authentication or authorization. By default, IDs are assigned sequentially and tasks can be listed. With the default settings, the details of launched tasks (including query structures) are therefore visible to all REST API clients.
See the document REST API security considerations for additional details on the options that mitigate these risks (randomized task ID generation and disabling listing) using server feature flag options.