Raw file handling
Please note that this is an experimental development direction; incompatible changes are expected in future releases. If you plan to use this extension library, please contact us to discuss your use case.
Storing and retrieving raw content through the REST API is essential to integrate the MadFast REST server as a non-static, mutable component.
Main components of the raw file handling:
- rest/experimental-rawfiles: REST API endpoint with methods supporting POST, PUT, GET and DELETE requests.
- rest/experimental-rawfiles-content: Endpoint for accessing raw file contents with the URL semantics of HTTP.
- ExperimentalRawFileInfo: Resource info DTO for a single file.
- ExperimentalRawFilesInfo: Resource info DTO for the resource class.
- com.chemaxon.overlap.res.ExperimentalRawFile: Internal representation class.
Creating raw files
Launch the MadFast server (from the distribution's root directory).
bin/gui.sh -port 8085
Create content to be posted:
echo "Hello, World" > hello.txt
echo "<html><head><title>Hello</title></head><body>Hello, World</body></html>" > hello.html
Use curl to add these files by sending a multipart/form-data POST request to the rest/experimental-rawfiles endpoint. The response ExperimentalRawFileInfo is the resource descriptor in JSON for the freshly created resource.
curl -X POST -F file=@hello.txt http://localhost:8085/rest/experimental-rawfiles | python -m json.tool
{
"contenttype": "application/octet-stream",
"description": "Uploaded from file hello.txt",
"name": "hello.txt",
"size": 13,
"time": 0,
"url": "rest/experimental-rawfiles/hello.txt"
}
The resource is created with the default application/octet-stream content type.
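When scripting against the endpoint, a single field of the returned descriptor is often more useful than the pretty-printed JSON. The following is a minimal sketch, assuming python3 is available on the client. The helper name rawfile_field is hypothetical and not part of MadFast.

```shell
# Hypothetical helper (not part of MadFast): print one top-level field of a
# JSON resource descriptor read from standard input.
# Assumes python3 is on the PATH.
rawfile_field() {
    python3 -c 'import json, sys; print(json.load(sys.stdin)[sys.argv[1]])' "$1"
}
```

For example, piping the POST response into `rawfile_field name` (with `curl -s` to suppress the progress meter) would print the name assigned to the new resource.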
Accessing rawfiles
We can access rawfile resources with curl from endpoint rest/experimental-rawfiles/{res}/raw:
curl -i http://localhost:8085/rest/experimental-rawfiles/hello.txt/raw
HTTP/1.1 200 OK
Date: Thu, 16 May 2019 20:38:20 GMT
Content-Type: application/octet-stream
content-disposition: attachment; filename = hello.txt
Content-Length: 13
Server: Jetty(9.4.15.v20190215)
Hello, World
Option -i passed to curl prints the response HTTP headers as part of the output. Note the presence of the content-disposition header.
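The presence of the header can also be checked in a script. The sketch below is a hypothetical helper, not a MadFast or curl feature. It reads response headers from standard input and reports through its exit status whether a content-disposition header is among them.

```shell
# Hypothetical helper: succeed (exit status 0) if the HTTP response headers
# read from standard input contain a content-disposition header.
# Header names are case-insensitive, hence grep -i.
has_content_disposition() {
    grep -qi '^content-disposition:'
}
```

One way to feed it is `curl -s -D - -o /dev/null <URL> | has_content_disposition`, where -D - writes the headers to standard output and -o /dev/null discards the body.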
Skip content-disposition header
With endpoint rest/experimental-rawfiles/{res}/raw-nocd the content-disposition HTTP header will not be attached:
curl -i http://localhost:8085/rest/experimental-rawfiles/hello.txt/raw-nocd
HTTP/1.1 200 OK
Date: Thu, 16 May 2019 20:45:09 GMT
Content-Type: application/octet-stream
Content-Length: 13
Server: Jetty(9.4.15.v20190215)
Hello, World
An equivalent endpoint, rest/experimental-rawfiles-content/{res}, is also available to access the rawfile contents with a URL structure that conforms more closely to HTTP conventions:
curl -i http://localhost:8085/rest/experimental-rawfiles-content/hello.txt
HTTP/1.1 200 OK
Date: Thu, 16 May 2019 20:45:44 GMT
Content-Type: application/octet-stream
Content-Length: 13
Server: Jetty(9.4.15.v20190215)
Hello, World
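The three access URLs differ only in their path layout, which can be captured in a small helper. This is a sketch only. The function name and the variant labels (raw, raw-nocd, content) are assumptions made for illustration, and the resource name is inserted as-is without URL encoding.

```shell
# Hypothetical helper: build the access URL of a raw file resource.
#   $1  server base URL, e.g. http://localhost:8085
#   $2  resource name, e.g. hello.txt (not URL-encoded by this sketch)
#   $3  variant: raw (default), raw-nocd or content
rawfile_url() {
    case "${3:-raw}" in
        raw)      printf '%s/rest/experimental-rawfiles/%s/raw\n' "$1" "$2" ;;
        raw-nocd) printf '%s/rest/experimental-rawfiles/%s/raw-nocd\n' "$1" "$2" ;;
        content)  printf '%s/rest/experimental-rawfiles-content/%s\n' "$1" "$2" ;;
        *)        echo "unknown variant: $3" >&2; return 1 ;;
    esac
}
```

With this in place, `curl -i "$(rawfile_url http://localhost:8085 hello.txt content)"` corresponds to the last request shown above.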
Accessing from browser
Or from the browser on URLs:
- With content-disposition header: http://localhost:8085/rest/experimental-rawfiles/hello.txt/raw
- Without content-disposition header: http://localhost:8085/rest/experimental-rawfiles/hello.txt/raw-nocd
- Without content-disposition header, alternative URL structure: http://localhost:8085/rest/experimental-rawfiles-content/hello.txt
Note that when the content-disposition header is set, the browser will save the content as a file. Without this header the browser will try to render it in place. In this case (since the content-type is the default application/octet-stream) the browser typically is not able to render the content.
Specifying details
We can specify the content type and description too (see the documentation of the rest/experimental-rawfiles endpoint POST request):
curl -X POST \
-F contenttype=text/plain \
-F "description=This is a text file" \
-F file=@hello.txt \
http://localhost:8085/rest/experimental-rawfiles | python -m json.tool
{
"contenttype": "text/plain",
"description": "This is a text file",
"name": "hello.txt-1",
"size": 13,
"time": 0,
"url": "rest/experimental-rawfiles/hello.txt-1"
}
The resource is created with name hello.txt-1. If we request the file with no content-disposition header we can now view it from the browser on URL http://localhost:8085/rest/experimental-rawfiles/hello.txt-1/raw-nocd or on URL http://localhost:8085/rest/experimental-rawfiles-content/hello.txt-1.
See https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Content-Disposition for further details on the content-disposition header.
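For repeated uploads, the optional form fields can be wrapped in a function. The sketch below uses only the form fields from the request above (contenttype, description, file). The function name and the DRY_RUN switch are assumptions of this example, not features of curl or MadFast.

```shell
# Hypothetical wrapper around the POST request to rest/experimental-rawfiles.
#   $1  server base URL          $2  local file to upload
#   $3  content type (optional)  $4  description (optional)
# With DRY_RUN set to a non-empty value the curl arguments are printed, one
# per line, instead of being sent.
upload_rawfile() {
    base=$1
    file=$2
    contenttype=${3:-}
    description=${4:-}
    # Rebuild the positional parameters as the curl argument list.
    set -- -X POST
    if [ -n "$contenttype" ]; then set -- "$@" -F "contenttype=$contenttype"; fi
    if [ -n "$description" ]; then set -- "$@" -F "description=$description"; fi
    set -- "$@" -F "file=@$file" "$base/rest/experimental-rawfiles"
    if [ -n "${DRY_RUN:-}" ]; then
        printf '%s\n' "$@"
    else
        curl "$@"
    fi
}
```

Calling `upload_rawfile http://localhost:8085 hello.txt text/plain "This is a text file"` would issue the same request as the curl command above. Omitted optional arguments are simply not sent.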
Modifying raw files
With a PUT request sent to rest/experimental-rawfiles/{res} (where {res} is the name of the raw file resource to be modified) we can specify the resource name explicitly and overwrite the file previously stored under that name:
curl -X PUT \
-F contenttype=text/plain \
-F "description=Text file with proper (text/plain) content type" \
-F file=@hello.txt \
http://localhost:8085/rest/experimental-rawfiles/hello.txt | python -m json.tool
{
"contenttype": "text/plain",
"description": "Text file with proper (text/plain) content type",
"name": "hello.txt",
"size": 13,
"time": 0,
"url": "rest/experimental-rawfiles/hello.txt"
}
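In the sample responses the size field equals the byte count of the uploaded file: 13 for hello.txt, which includes the trailing newline written by echo. Assuming this holds in general (it is not stated explicitly here), a quick sanity check after an upload or overwrite is to compare the reported size with the local file. The helper name below is hypothetical.

```shell
# Hypothetical helper: print the size of a local file in bytes.
# tr strips the space padding that some wc implementations emit.
local_size() {
    wc -c < "$1" | tr -d ' '
}
```

For hello.txt this prints 13, which can be compared with the size value in the returned descriptor.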
Deleting raw files
A DELETE request sent to rest/experimental-rawfiles/{res} removes the raw file:
curl -i -X DELETE http://localhost:8085/rest/experimental-rawfiles/hello.txt
HTTP/1.1 204 No Content
Date: Thu, 16 May 2019 20:49:57 GMT
Server: Jetty(9.4.15.v20190215)
As we see, the DELETE request returned with HTTP status 204 No Content and no further content.
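In a script, the status code can be extracted from the response instead of being read by eye. This is a minimal sketch with a hypothetical helper name. It only parses the status line and makes no assumption about which codes the server returns in other situations.

```shell
# Hypothetical helper: print the numeric status code from the first line of
# an HTTP response (as printed by curl -i) read from standard input.
http_status_code() {
    awk 'NR == 1 { print $2; exit }'
}
```

For instance, `curl -s -i -X DELETE <URL> | http_status_code` would print 204 for the request shown above.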
Choosing proper content type
Uploading the HTML file with different content types:
# Content to be uploaded
echo "<html><head><title>Hello</title></head><body>Hello, World</body></html>" > hello.html
# Use improper (text/plain) content type
curl -X PUT \
-F contenttype=text/plain \
-F "description=HTML file with improper (text/plain) content type" \
-F file=@hello.html \
http://localhost:8085/rest/experimental-rawfiles/hello.html-as-text | python -m json.tool
# Use proper (text/html) content type
curl -X PUT \
-F contenttype=text/html \
-F "description=HTML file with proper (text/html) content type" \
-F file=@hello.html \
http://localhost:8085/rest/experimental-rawfiles/hello.html-as-html | python -m json.tool
{
"contenttype": "text/plain",
"description": "HTML file with improper (text/plain) content type",
"name": "hello.html-as-text",
"size": 72,
"time": 0,
"url": "rest/experimental-rawfiles/hello.html-as-text"
}
{
"contenttype": "text/html",
"description": "HTML file with proper (text/html) content type",
"name": "hello.html-as-html",
"size": 72,
"time": 0,
"url": "rest/experimental-rawfiles/hello.html-as-html"
}
We can see the difference when the files are opened from a browser:
- Proper content type: http://localhost:8085/rest/experimental-rawfiles-content/hello.html-as-html is displayed as an HTML page.
- Improper content type: http://localhost:8085/rest/experimental-rawfiles-content/hello.html-as-text is rendered as text (HTML source).
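Since the content type has to be supplied by the client, an upload script may want to derive it from the file name. The mapping below is an illustrative assumption, deliberately small and not something MadFast provides. Unknown extensions fall back to application/octet-stream, the default seen in the first upload.

```shell
# Hypothetical helper: choose a contenttype value from the file extension.
# The mapping is illustrative only; extend it as needed.
guess_contenttype() {
    case "$1" in
        *.html|*.htm) echo "text/html" ;;
        *.txt)        echo "text/plain" ;;
        *.json)       echo "application/json" ;;
        *.png)        echo "image/png" ;;
        *)            echo "application/octet-stream" ;;
    esac
}
```

It can be used inline in the form field, for example `-F "contenttype=$(guess_contenttype hello.html)"`.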
View raw files on WebUI
Raw files are presented on the default index page of the WebUI (http://localhost:8085/). Note that the following screenshots were made before deleting hello.txt:
Contents of individual files can be displayed:
And metadata:
Reading raw files on server startup
It is possible to read raw files on server startup with command line option -rawfile <SPEC>:
# Write a file
echo "<html><head><title>Hello</title></head><body>Hello, World</body></html>" > hello.html
# Launch server, read the written file with multiple options
bin/gui.sh -port 8085 \
-rawfile -file:hello.html:-name:read_with_default_opts \
-rawfile "-file:hello.html:-name:read_as_text:-contenttype:text/plain:-description:File read with text/plain content type" \
-rawfile "-file:hello.html:-name:read_as_html:-contenttype:text/html:-description:File read with text/html content type"
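The <SPEC> values above follow a colon-separated pattern of -key:value pairs. A launch script could assemble them with a helper such as the following sketch. The function name is hypothetical, and values containing a colon are not handled, since how the option treats them is not covered here.

```shell
# Hypothetical helper: assemble a -rawfile <SPEC> value in the form used above.
#   $1  file   $2  name   $3  content type (optional)   $4  description (optional)
# Assumption: none of the values contains ':'.
rawfile_spec() {
    spec="-file:$1:-name:$2"
    if [ -n "${3:-}" ]; then spec="$spec:-contenttype:$3"; fi
    if [ -n "${4:-}" ]; then spec="$spec:-description:$4"; fi
    printf '%s\n' "$spec"
}
```

The result is passed as a single quoted argument, for example `-rawfile "$(rawfile_spec hello.html read_as_html text/html "File read with text/html content type")"`.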
Please note that command line option -additionalresourcedir <DIRECTORY> also provides a mechanism to serve additional static files. See document REST API / Web UI for similarity searches for details.
Security considerations
The ability to upload or modify arbitrary raw content over the REST API could introduce security risks in certain deployments. See document REST API security considerations for additional details on the options to mitigate these risks using server feature flag options.