Storck Python Client#

Storck exposes a REST API, which allows communication with any machine or programming language that supports HTTP requests. To simplify the use of Storck for end users, we provide a high-level Python client.

Installation#

To use the client, install the Storck Client package. The package is available in a GitLab repository separate from the main Storck repo: https://gitlab.cern.ch/velo-calibration-software/storck_client

Installation is straightforward; the recommended way is to install directly from the client’s repository:

$ pip install git+https://gitlab.cern.ch/velo-calibration-software/storck_client

The package can also be installed from PyPI:

$ pip install storck-client

Preparing credentials#

The two essential credentials needed for communication with Storck are

  • User Token

  • Workspace Token

The User Token is used as an authentication key, so never share it with anyone. The Workspace Token authenticates access to a workspace, and it can be shared to give others access to that workspace.

Both can be set as environment variables (STORCK_USER_TOKEN and STORCK_WORKSPACE_TOKEN) by sourcing a shell file:

export STORCK_USER_TOKEN="7c817d0d8f0e2b719a9df798fbdefe75cf5ba4be"
export STORCK_WORKSPACE_TOKEN="73f19f87-f741-4f53-ae50-a9272dc87ea7"
export STORCK_API_HOST="http://localhost:8000"

Once the block above is saved to a file such as env.sh, the variables can be loaded by running source env.sh. The optional STORCK_API_HOST variable sets the address of the Storck service.

This is the preferred way of storing personal credentials for Storck.
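As a minimal sketch of how these variables might be consumed from Python, the helper below collects them into keyword arguments named after the StorckClient constructor parameters. The name load_storck_settings is hypothetical and not part of storck_client, and the client may already read these variables on its own.

```python
import os


def load_storck_settings(env=None):
    """Collect Storck settings from environment variables.

    Hypothetical helper, not part of storck_client. The returned keys
    mirror the StorckClient constructor parameters.
    """
    env = os.environ if env is None else env
    required = ("STORCK_USER_TOKEN", "STORCK_WORKSPACE_TOKEN")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return {
        # Fall back to the constructor's documented default host.
        "api_host": env.get("STORCK_API_HOST", "http://localhost:8000"),
        "user_token": env["STORCK_USER_TOKEN"],
        "workspace_token": env["STORCK_WORKSPACE_TOKEN"],
    }
```

With the package installed, the result could be passed on as StorckClient(**load_storck_settings()). Failing early on a missing token gives a clearer error than a rejected request later.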

Usage#

After installing the package, you can use it by importing StorckClient.

Storck Client Module#

To use the module, import the StorckClient class and create an instance of it. You can then use all of the methods described in its documentation.

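from storck_client import StorckClient
client = StorckClient()
upload_dir = "/path/to/upload_dir"  # placeholder: local directory that holds the file
file_path = "data/file.txt"  # placeholder: path relative to upload_dir, reused as the Storck path
client.upload_file(upload_dir + "/" + file_path, path=file_path)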
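The same pattern extends to a whole directory. The sketch below is only an illustration: upload_tree is a hypothetical helper, and it assumes nothing about the client beyond the documented upload_file(filename, path=...) call.

```python
from pathlib import Path


def upload_tree(client, upload_dir):
    """Upload every regular file below upload_dir (hypothetical helper).

    Each file is stored in Storck under its path relative to upload_dir.
    """
    root = Path(upload_dir)
    results = []
    for local_path in sorted(p for p in root.rglob("*") if p.is_file()):
        stored_path = local_path.relative_to(root).as_posix()
        # filename is the local path; path is the location inside Storck.
        results.append(client.upload_file(str(local_path), path=stored_path))
    return results
```

Passing the client in as an argument keeps the helper independent of how the StorckClient instance was configured.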

Methods and Classes of Storck Client#

class storck_client.StorckClient(api_host: str = 'http://localhost:8000', user_token: str | None = None, workspace_token: str | None = None, storck_root_dir: str | None = None)[source]#

Bases: object

add_or_modify_metadata_schema(filetype: str, schema: dict | str)[source]#
add_user_to_workspace(user_id: int)[source]#

Adds a user to the workspace.

Parameters:

user_id – the id of the user to be added to the workspace.

auth_verify() dict[source]#

Check whether user exists in storck.

check_file(filepath: str, fhash: str) dict[source]#

Searches for a file under the filepath, with specific fhash.

Parameters:
  • filepath – A storck filepath

  • fhash – A file hash

Returns:

list of files matching the query

create_workspace(name: str) dict[source]#

Creates a workspace with the given name.

download_file(file_id: int, target_path: str | Path, local_transfer=False)[source]#

Downloads the file to the target_path.

Parameters:
  • file_id – the unique file id in storck

  • target_path – the full path where the file should be saved, including the filename, e.g. “/final/target/file.txt”

  • local_transfer – whether to use a local transfer (if the file is accessible by the “cp” command) or an HTTP request
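Because target_path must include the filename, a small wrapper can derive it from the file's Storck path. The sketch below is an assumption-laden illustration: download_into is a hypothetical helper that relies only on the documented download_file(file_id, target_path) call.

```python
from pathlib import Path


def download_into(client, file_id, stored_path, target_dir):
    """Download a file into target_dir, keeping its original filename.

    Hypothetical helper: stored_path is the file's path in Storck, and
    only its last component is used for the local filename.
    """
    target = Path(target_dir) / Path(stored_path).name
    # Create the destination directory in case it does not exist yet.
    target.parent.mkdir(parents=True, exist_ok=True)
    client.download_file(file_id, target)
    return target
```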

get_file_content(file_id: int) bytes[source]#

Gets the content of the file.

Parameters:

file_id – id of the file to download.

get_info(file_id: int | None = None, path: str | None = None) dict[source]#

Gets detailed information about the file.

Parameters:
  • file_id – id of the file.

  • path – database path of the file

get_workspaces() dict[source]#

Gets the list of current workspaces.

Returns:

dict of workspaces

list_metadata_schema()[source]#
search(search_dict: str | dict | None = None) dict[source]#

Searches for files. If name_contains is provided, looks for a filename containing the given string. If search_dict is provided, it is used as the JSON-encoded query.

# this will return all of the files in the workspace
client.search()
# this will return files whose stored path equals the given string
client.search(search_dict={'stored_path': '/some/path/or/name/part'})
# this will return all files whose stored path contains the given string
client.search(search_dict={'stored_path__contains': '/some/path/or/name/part'})
# this will return the file with id equal to 345
client.search(search_dict={"id": 345})
# this will search for files with the metadata value ramp_speed equal to 5
client.search(search_dict={"metadata__ramp_speed": 5})
# this will search for files with the metadata value ramp_speed greater than or equal to 5
client.search(search_dict={"metadata__ramp_speed__gte": 5})
Parameters:

search_dict – A Django-style query, given either as a dict or as a JSON-encoded string. A string is first decoded into a Python dict; the dict is then unpacked as keyword arguments to Django’s filter() method. To query metadata fields, prefix the key with “metadata” followed by two underscores (see the examples above), then continue with the JSONField lookup.

Returns:

list of files matching the query
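The key-naming rule for metadata queries can be sketched as follows. The helper metadata_query is hypothetical and only builds the dictionary; it does not contact Storck.

```python
import json


def metadata_query(**conditions):
    """Prefix each keyword with 'metadata__' (hypothetical helper).

    The result is meant to be passed as search_dict, where it ends up
    as keyword arguments of Django's filter().
    """
    return {"metadata__" + key: value for key, value in conditions.items()}


# ramp_speed greater than or equal to 5, as in the example above
query = metadata_query(ramp_speed__gte=5)

# search_dict may also be given as a JSON-encoded string
query_as_json = json.dumps(query)
```

Either query or query_as_json could then be passed as client.search(search_dict=...).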

send_file_content(filename, path, data, query)[source]#
set_workspace_token(workspace_token: str)[source]#

Overrides the current workspace token and the corresponding environment variable.

upload_file(filename: str, path: str | None = None, metadata: str | None = None, file_hash: str | None = None, local_transfer=False) dict[source]#

Uploads the file to storck.

Parameters:
  • filename – Path to the file on the client side.

  • path – Optional database path to be used in storck. If not provided filename will be used instead.

  • metadata – a metadata JSON string

storck_client.md5sum_hash(fpath)[source]#
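md5sum_hash is listed without a description. Judging by its name, it presumably returns the MD5 digest of the file at fpath, the kind of value expected as fhash in check_file and file_hash in upload_file; this is an assumption. The sketch below shows how such a digest can be computed with the standard library. md5_of_file is a hypothetical stand-in, and the actual implementation may differ.

```python
import hashlib


def md5_of_file(fpath, chunk_size=1024 * 1024):
    """Return the hexadecimal MD5 digest of a file (hypothetical stand-in)."""
    digest = hashlib.md5()
    with open(fpath, "rb") as handle:
        # Read in chunks so large files are never loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```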

Scripts#

Along with the storck_client module come two scripts: storck_sync and storck_upload. Running python -m storck_sync -h prints its usage:

usage: storck_sync.py [-h] [--host API_HOST] [--user-token USER_TOKEN]
                [--workspace-token WORKSPACE_TOKEN] [--dir AUTO_UPLOAD_DIR]

Automatically upload files from a given directory.

optional arguments:
-h, --help            show this help message and exit
--host API_HOST, -a API_HOST
                        STORCK api host
--user-token USER_TOKEN, -u USER_TOKEN
                        STORCK user token
--workspace-token WORKSPACE_TOKEN, -w WORKSPACE_TOKEN
                        STORCK workspace token
--dir AUTO_UPLOAD_DIR, -d AUTO_UPLOAD_DIR
                        auto upload directory

Example usage: python -m storck_sync --dir path/to/dir. This script automatically uploads files from the given directory to Storck. The token and host arguments can also be set through environment variables.

The second script storck_upload can be used to upload a single file along with its metadata:

usage: storck_upload.py [-h] [--storck_filepath STORCK_FILEPATH] [--metadata_str AUTO_UPLOAD_DIR] [--host API_HOST]
                        [--user-token USER_TOKEN] [--workspace-token WORKSPACE_TOKEN]
                        file

This script uploads a single file in to the storck, along with optional metadata. !!Warning for the future!! This script creates
just a single instance of the storck client connection, and destroys it after upload It might be more suitable in the future to
use a mechanism that will continously wait for new uploads.

positional arguments:
file                  file that wil lbe uploaded

optional arguments:
-h, --help            show this help message and exit
--storck_filepath STORCK_FILEPATH
                        if you want to store the ifle in storck under different path than provided
--metadata_str AUTO_UPLOAD_DIR
                        auto upload directory
--host API_HOST, -a API_HOST
                        STORCK api host
--user-token USER_TOKEN, -u USER_TOKEN
                        STORCK user token
--workspace-token WORKSPACE_TOKEN, -w WORKSPACE_TOKEN
                        STORCK workspace token

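Example usage: python -m storck_upload path/to/file --metadata_str "metadata as str". Note that the help text above describes --metadata_str as “auto upload directory”; judging by the option name and this example, the option takes the metadata string.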
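When the upload script is driven from another Python program, the command line can be assembled as in the sketch below. build_upload_command is a hypothetical helper; it uses only the options shown in the usage text and assumes that the metadata string is JSON.

```python
import json
import sys


def build_upload_command(file, metadata=None, storck_filepath=None):
    """Assemble the argument list for storck_upload (hypothetical helper)."""
    command = [sys.executable, "-m", "storck_upload", str(file)]
    if storck_filepath is not None:
        command += ["--storck_filepath", storck_filepath]
    if metadata is not None:
        # Assumption: the metadata string is expected to be JSON.
        command += ["--metadata_str", json.dumps(metadata)]
    return command
```

The list can be handed to subprocess.run(command, check=True). Passing a list rather than a shell string avoids quoting problems with the JSON payload.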

Crontab auto-upload#

Additionally, the Storck client repository contains an example script that installs a cron job to run storck_sync at regular intervals:

#!/bin/bash

STORCK_AUTO_UPLOAD_DIR=$STORCK_AUTO_UPLOAD_DIR
STORCK_API_HOST=$STORCK_API_HOST
STORCK_USER_TOKEN=$STORCK_USER_TOKEN
STORCK_WORKSPACE_TOKEN=$STORCK_WORKSPACE_TOKEN

SCRIPTPATH="$( cd -- "$(dirname "$0")" >/dev/null 2>&1 ; pwd -P )"

apt-get update
apt-get install -y cron
pip install storck-client
# install the job from stdin; note that this replaces the current user's existing crontab
echo "0 */2 * * * /usr/bin/python3 -m storck_sync -a ${STORCK_API_HOST} -u ${STORCK_USER_TOKEN} -w ${STORCK_WORKSPACE_TOKEN} -d ${STORCK_AUTO_UPLOAD_DIR} >> ~/storck_auto_upload.log 2>&1" | crontab -

How to use#

The auto-upload script checks every 2 hours whether new files have appeared in the watched directory and uploads them to the workspace selected during installation (via the workspace token). The Storck instance to work with is selected by setting the host value during installation.
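The schedule comes from the cron expression 0 */2 * * *, which fires at minute 0 of every even hour. The sketch below restates that rule in Python for illustration only; runs_at is a hypothetical function and not part of the client.

```python
from datetime import datetime


def runs_at(moment):
    """Return True when the cron expression '0 */2 * * *' fires at moment."""
    # Minute field '0': only at the top of the hour.
    # Hour field '*/2': hours 0, 2, 4, ..., 22.
    return moment.minute == 0 and moment.hour % 2 == 0


# Hours of a day at which the job is triggered
run_hours = [hour for hour in range(24) if runs_at(datetime(2024, 1, 1, hour, 0))]
```

A file dropped into the directory just after a run therefore waits up to two hours before it is uploaded.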

To install the script we need to set the following environment variables:

  • STORCK_AUTO_UPLOAD_DIR - absolute path to the directory to watch

  • STORCK_API_HOST - STORCK api host

  • STORCK_USER_TOKEN - STORCK user token

  • STORCK_WORKSPACE_TOKEN - STORCK workspace token

Installation#

  1. Install the Storck client

  2. Set the environment variables listed above

  3. Run the installation script: bash install.sh

Check running process#

You can check whether the cron job was installed correctly by running crontab -l. You should get something similar to:

0 */2 * * * /usr/bin/python3 -m storck_sync -a $STORCK_API_HOST -u $STORCK_USER_TOKEN -w $STORCK_WORKSPACE_TOKEN -d $STORCK_AUTO_UPLOAD_DIR >> ~/storck_auto_upload.log 2>&1

Logging#

To check which files were uploaded by the script, or whether it is working correctly, inspect the log file ~/storck_auto_upload.log.