Some things in technology make more sense when you see them working end to end rather than as individual pieces, because that shows how everything is stitched together.
I am going to talk about one small project I created recently. It scans a file from a directory using a 3rd party antivirus API (OPSWAT).
After the scan, if the file is not infected it is stored in a different directory (success file blob); if the file is infected it is moved to an infected-files directory so that it doesn’t infect other existing files.
Prerequisite:
Create a free/paid account on OPSWAT:
Once you create an account on OPSWAT, you will receive an API key for calling the antivirus API.
I will be using 2 APIs for scanning:
i) Scanning a file by file upload: https://onlinehelp.opswat.com/mdcloud/2.1_Scanning_a_file_by_file_upload.html
ii) Retrieving scan reports using data ID: https://onlinehelp.opswat.com/mdcloud/2.2_Retrieving_scan_reports_using_data_ID.html
Getting the scan report is a 2-step process. In the 1st call, the file is uploaded and OPSWAT returns an asynchronous response with a data id (the file upload and scanning are still in progress at that point). In the 2nd call, we fetch the scan report using the data id returned by the 1st call.
1st call example:
Request:
curl -X POST https://api.metadefender.com/v4/file \
-H "apikey: ${APIKEY}" \
-H 'content-type: application/octet-stream' \
--data-binary @/path/to/data.file
Response:
{
"data_id": "bzIwMDExN1NreHhiOG9RSmJVU2tXeFdJc1FKWjg",
"status": "inqueue",
"in_queue": 1,
"queue_priority": "normal",
"sha1": "068AE4D07A7F4FE2BF955CBA0FD05AB0A5A8A6FE",
"sha256": "67C6BCEE6FFCEFA887E415CBF0247C2788696169B58EB66319F558DDB6822D9D"
}
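The upload call above can also be sketched in Python. This is only a minimal illustration using the standard library, not the Logic App implementation described later; the function names `build_upload_request` and `extract_data_id` are hypothetical, while the URL and headers are taken from the curl example.

```python
import json
import urllib.request

# Endpoint taken from the curl example above.
UPLOAD_URL = "https://api.metadefender.com/v4/file"


def build_upload_request(api_key: str, file_bytes: bytes) -> urllib.request.Request:
    """Build (but do not send) the POST request that uploads a file for scanning."""
    return urllib.request.Request(
        UPLOAD_URL,
        data=file_bytes,
        headers={
            "apikey": api_key,
            "content-type": "application/octet-stream",
        },
        method="POST",
    )


def extract_data_id(response_body: str) -> str:
    """Return the data_id field from the JSON body of the upload response."""
    payload = json.loads(response_body)
    return payload["data_id"]


if __name__ == "__main__":
    # Sending the request needs a real API key and network access, e.g.:
    #   with urllib.request.urlopen(build_upload_request(key, data)) as resp:
    #       data_id = extract_data_id(resp.read().decode("utf-8"))
    sample = '{"data_id": "bzIwMDExN1NreHhiOG9RSmJVU2tXeFdJc1FKWjg", "status": "inqueue"}'
    print(extract_data_id(sample))
```

The data id extracted here is the value that the 2nd call needs.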
2nd call example:
Request:
curl -X GET \
https://api.metadefender.com/v4/file/ZTE2MTIyNkhKeGs5WElSNHhIMVFGLVlUYk85LP \
-H "apikey: ${APIKEY}"
Response:
{
"scan_result_history_length": 8,
"file_id": "bzE5MDIxMkJ5RXEtWmdIRQ",
"data_id": "bzE5MDIxMkJ5RXEtWmdIRXJrR2VlNmhNWUg0",
"sanitized": {
"result": "Allowed",
"progress_percentage": 100,
"data_id": "ZDE5MDIyMEJ5RXEtWmdIRS5zYW5pdGl6ZWRCeWx1a2Fic1NF",
"reason": ""
},
"process_info": {
"result": "Blocked",
"profile": "Sanitize",
"post_processing": {
"copy_move_destination": "",
"converted_to": "xls",
"converted_destination": "Ft._immediata_group_7893_2019_02_sanitized_by_OPSWAT_MetaDefender_779e0a0966f348fcaecdacc4f6c47e16.xls",
"actions_ran": "Sanitized",
--
--
--
},
"rescan_available": true,
"scan_all_result_i": 1,
"start_time": "2019-02-20T17:22:27.058Z",
"total_time": 1166,
"total_avs": 37,
"total_detected_avs": 17,
"progress_percentage": 100,
"scan_all_result_a": "Infected"
--
--
"share_file": 1,
"rest_version": "4",
"additional_info": [],
"votes": {
"up": 0,
"down": 0
}
}
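Because the scan runs asynchronously, the report may not be complete the first time it is requested. The sketch below shows one possible way to repeat the 2nd call until the scan is finished. It rests on assumptions: the names `wait_for_report`, `fetch_report` and `http_fetch_report` are hypothetical, and the progress value is assumed to sit at `scan_results.progress_percentage`, a parent object that the truncated sample above does not show.

```python
import json
import time
import urllib.request
from typing import Callable

# URL pattern taken from the curl example above.
REPORT_URL = "https://api.metadefender.com/v4/file/{data_id}"


def http_fetch_report(api_key: str) -> Callable[[str], dict]:
    """Return a function that performs the 2nd call (GET report by data id)."""
    def fetch(data_id: str) -> dict:
        request = urllib.request.Request(
            REPORT_URL.format(data_id=data_id),
            headers={"apikey": api_key},
        )
        with urllib.request.urlopen(request) as response:
            return json.load(response)
    return fetch


def wait_for_report(
    fetch_report: Callable[[str], dict],
    data_id: str,
    interval_seconds: float = 10.0,
    max_attempts: int = 30,
) -> dict:
    """Call fetch_report(data_id) until the scan reports 100% progress.

    ASSUMPTION: progress is at report["scan_results"]["progress_percentage"].
    """
    for attempt in range(max_attempts):
        report = fetch_report(data_id)
        progress = report.get("scan_results", {}).get("progress_percentage", 0)
        if progress == 100:
            return report
        if attempt < max_attempts - 1:
            time.sleep(interval_seconds)  # wait before asking again
    raise TimeoutError(f"scan {data_id} did not finish after {max_attempts} attempts")
```

With a real key this could be called as `wait_for_report(http_fetch_report(api_key), data_id)`. Passing the fetch function in as a parameter keeps the retry loop testable without network access.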
Create a free/paid account on Azure portal
If you don’t have an account, you can create a free or pay-as-you-go one at https://portal.azure.com/
Integration Solution Resources:
To build this solution, we need to create the resources below:
· A storage account with 3 blob containers:
  o toscan
  o scanned-success-files
  o infected-files
· A Key Vault, to store the OPSWAT API key
· A Service Bus namespace with a queue
Integration Solution:
Let’s follow the above diagram step by step to understand how things are connected and working.
1) Put any file to be scanned in the toscan container.
2) As soon as the file is uploaded to the toscan container, it triggers the “scanfile” logic app (which checks for added/updated blobs every 10 sec) to process the file for scanning.
3) The scanfile logic app gets the OPSWAT API key from Key Vault and makes an HTTP call to the OPSWAT API to upload and scan the blob file that we placed for scanning.
4) The OPSWAT API receives the request and sends an asynchronous response.
5) As soon as we receive the response from OPSWAT with the data id, the same response message is pushed to the Service Bus queue for further processing.
6) Once the message is received in the Service Bus queue, it triggers another logic app called “getfilescanreport” (which checks for messages every 10 sec).
7) The getfilescanreport logic app makes the 2nd call to OPSWAT to retrieve the scan report, using the data id received in the 1st OPSWAT API call.
8) OPSWAT receives the 2nd request and returns the scan report in the HTTP response.
9) The getfilescanreport logic app assesses the response and decides which folder the file should be stored in.
10) If the file is infected, it is stored in the infected-files folder; otherwise it is stored in scanned-success-files. In the last step, the file is deleted from the “toscan” folder, as the process is complete.
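The decision in steps 9 and 10 can be illustrated with a small Python sketch. It is not the Logic App itself: plain dictionaries stand in for the three blob containers, the function names are hypothetical, and the verdict is assumed to be at `scan_results.scan_all_result_a` in the report (the truncated sample response does not show the parent object).

```python
from typing import Dict

# Container names from the resource list in this post.
TOSCAN = "toscan"
SUCCESS = "scanned-success-files"
INFECTED = "infected-files"


def destination_container(report: dict) -> str:
    """Pick the target container from the scan report.

    ASSUMPTION: the verdict is at report["scan_results"]["scan_all_result_a"].
    """
    verdict = report.get("scan_results", {}).get("scan_all_result_a")
    return INFECTED if verdict == "Infected" else SUCCESS


def route_file(storage: Dict[str, Dict[str, bytes]], blob_name: str, report: dict) -> str:
    """Copy the blob to its target container, then delete it from toscan."""
    target = destination_container(report)
    storage[target][blob_name] = storage[TOSCAN][blob_name]  # store in target folder
    del storage[TOSCAN][blob_name]                            # last step: clean up toscan
    return target


if __name__ == "__main__":
    # In-memory stand-in for the storage account (illustration only).
    storage = {TOSCAN: {"invoice.xls": b"..."}, SUCCESS: {}, INFECTED: {}}
    report = {"scan_results": {"scan_all_result_a": "Infected"}}
    print(route_file(storage, "invoice.xls", report))
```

As in step 10, anything that is not reported as "Infected" goes to scanned-success-files. A stricter variant could send reports with a missing or unexpected verdict to a separate place instead of treating them as clean.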
Most of the steps above are very straightforward, but let’s look into the logic apps, where I have made the connections and written the business logic.