Add development updates.

Including:
1. Generated python documentation in docs/.
2. Started a new python client in client/.
3. Moved testing data to test/.
4. Added Cypress UI tests and pytest tests in test/.
5. A number of bug fixes and improvements.
Laura Glendenning
2020-07-30 11:49:53 -04:00
parent f93826e252
commit d6aa00330d
432 changed files with 1246355 additions and 1050198 deletions

.dockerignore Normal file

@@ -0,0 +1,32 @@
# Git
**/.git/
# Docker
**/docker-compose*
/setup_docker_test_data.sh
# Node
**/node_modules/
/frontend/annotation/dist/
# Eclipse
**/.project
**/.pydevproject
**/.settings/
# Python
**/__pycache__
# Database
/eve/db/
# Logs
backend/pine/backend/logs
pipelines/JavaNER/pmap_api/logs
# Misc
/local_data/
/instance/
/redis/data
**/Dockerfile
/results/

.env Normal file

@@ -0,0 +1,31 @@
# (C) 2019 The Johns Hopkins University Applied Physics Laboratory LLC.
REDIS_PORT=6379
EVE_PORT=7510
BACKEND_PORT=7520
MONGO_PORT=27018
EVE_DB_VOLUME=eve_db
OPENNLP_ID=5babb6ee4eb7dd2c39b9671c
CORENLP_ID=5babb6ee4eb7dd2c39b9671d
DOCUMENT_CLASSIFIER_ID=5babb6ee4eb7dd2c39b9671b
EXPOSED_SERVER_TYPE=https
EXPOSED_SERVER_NAME=localhost
EXPOSED_PORT=8888
EXPOSED_SERVER_TYPE_PROD=http
EXPOSED_SERVER_NAME_PROD=annotation
EXPOSED_PORT_PROD=80
AUTH_MODULE=vegas
#MONGO_URI=
#VEGAS_CLIENT_SECRET=
# Change these to be volume names instead of paths if you want to use docker volumes
# If SHARED_VOLUME is a docker volume, be sure it is populated with the contents of ./shared
SHARED_VOLUME=./shared
MODELS_VOLUME=./local_data/models
LOGS_VOLUME=./local_data/logs
DOCUMENT_IMAGE_VOLUME=./local_data/test_document_images

.gitignore vendored Normal file

@@ -0,0 +1,62 @@
# (C) 2019 The Johns Hopkins University Applied Physics Laboratory LLC.
# Node
**/node_modules/
# IDEs
**/.project
**/.pydevproject
**/.settings/
**/.idea/
# Python
**/__pycache__
# Logs
backend/pine/backend/logs
pipelines/pine/pipelines/logs
# Source https://github.com/helm/charts/blob/c194bce22cf6eae521bdd79d12ee04d9b1cd3e50/.gitignore
# General files for the project
pkg/*
*.pyc
bin/*
.project
/.bin
/_test/secrets/*.json
# OSX leaves these everywhere on SMB shares
._*
# OSX trash
.DS_Store
# Files generated by JetBrains IDEs, e.g. IntelliJ IDEA
.idea/
*.iml
# Vscode files
.vscode
# Emacs save files
*~
\#*\#
.\#*
# Vim-related files
[._]*.s[a-w][a-z]
[._]s[a-w][a-z]
*.un~
Session.vim
.netrwhist
# Chart dependencies
**/charts/*.tgz
.history
# Pipelines and local data
/pipelines/models/
/local_data/*
!/local_data/dev/test_images/static/
/results/

README.md

@@ -1,49 +1,65 @@
# PINE (Pmap Interface for Nlp Experimentation)
██████╗ ██╗███╗ ██╗███████╗
██╔══██╗██║████╗ ██║██╔════╝
██████╔╝██║██╔██╗ ██║█████╗
██╔═══╝ ██║██║╚██╗██║██╔══╝
██║ ██║██║ ╚████║███████╗
╚═╝ ╚═╝╚═╝ ╚═══╝╚══════╝
Pmap Interface for Nlp Experimentation
© 2019 The Johns Hopkins University Applied Physics Laboratory LLC.
## About PINE
PINE is a web-based tool for text annotation. It enables annotation at the document level as well
as over text spans (words). The annotation facilitates generation of natural language processing
(NLP) models to classify documents and perform named entity recognition. Some of the features
include:
* Generate models in Spacy, OpenNLP, or CoreNLP on the fly and rank documents using Active Learning
to reduce annotation time.
* Extensible framework - add NLP pipelines of your choice.
* Active Learning support - out-of-the-box active learning
(https://en.wikipedia.org/wiki/Active_learning_(machine_learning)) with pluggable active learning
ranking functions.
* Facilitates group annotation projects - view other people's annotations, calculate
inter-annotator agreement, and display annotation performance.
* Enterprise authentication - integrate with your existing OAuth/Active Directory Servers.
* Scalability - deploy in docker compose or a kubernetes cluster; ability to use database as a
service such as CosmosDB.
PINE was developed under internal research and development (IRAD) funding at the
[Johns Hopkins University Applied Physics Laboratory](https://www.jhuapl.edu/). It was created to
support the annotation needs of NLP tasks on the
[precision medicine analytics platform (PMAP)](https://pm.jh.edu/) at Johns Hopkins.
## Required Resources
Note - download required resources and place in `pipelines/pine/pipelines/resources`:
* apache-opennlp-1.9.0
* stanford-corenlp-full-2018-02-27
* stanford-ner-2018-02-27
These are required to build docker images for active learning.
Alternatively, you can use the provided convenience script:
`./pipelines/download_resources.sh`
----------------------------------------------------------------------------------------------------
## Development Environment
First, refer to the various README files in the subproject directories for dependencies.
Install the pipenv environment in `pipelines`.
Alternatively, a convenience script is provided:
```bash
./setup_dev_stack.sh
```
Then a dev stack can be run with:
```bash
@@ -55,7 +71,7 @@ planning to use that auth module.
The dev stack can be stopped with Ctrl-C.
Sometimes mongod doesn't seem to start in time. If you see a connection
error for mongod, just close it and try it again.
Once the dev stack is up and running, the following ports are accessible:
@@ -63,6 +79,53 @@ Once the dev stack is up and running, the following ports are accessible:
* `localhost:5000` hosts the backend
* `localhost:5001` hosts the eve layer
### Generating documentation
1. See `docs/README.md` for information on required environment.
2. `./generate_documentation.sh`
3. Generated documentation can then be found in `./docs/build`.
### Testing Data
To import testing data, run the dev stack and then run:
```bash
./setup_dev_test_data.sh
```
*WARNING*: This script will remove any pre-existing data. If you need to clear your database
for other reasons, stop your dev stack and then `rm -rf local_data/dev/eve/db`.
### Testing
There are test cases written using Cypress; for more information, see `test/README.md`.
The short version, to run the tests using the docker-compose stack:
1. `test/build.sh`
2. `test/run_docker_compose.sh --report`
3. Check `./results/<timestamp>` (the script in the previous step will print out the exact path) for:
* `reports/report.html`: an HTML report of tests run and their status
* `screenshots/`: for any screenshots from failed tests
* `videos/`: for videos of all the tests that were run
To use the interactive dashboard:
1. `test/build.sh`
2. `test/run_docker_compose.sh --dashboard`
It is also possible to run the cypress container directly, or locally with the dev stack. For more
information, see `test/README.md`.
### Versioning
There are three versions being tracked:
* overall version: environment variable `PINE_VERSION` based on the git tag/revision information (see `./version.sh`)
* eve/database version: controlled in `eve/python/settings.py`
* frontend version: controlled in `frontend/annotation/package.json`
The eve/database version should be bumped up when the schema changes. This will (eventually) be
used to implement data migration.
The frontend version is the least important.
### Using the copyright checking pre-commit hook
The script `pre-commit` is provided as a helpful utility to make sure that new files checked into
@@ -74,27 +137,23 @@ installed manually:
This hook greps for the copyright text in new files and gives you the option to abort if it is
not found.
----------------------------------------------------------------------------------------------------
## Docker Environments
*IMPORTANT*:
For all the docker-compose environments, it is required to set a `PINE_VERSION` environment
variable. To do this, either prepend each docker-compose command:
```bash
PINE_VERSION=$(./version.sh) docker-compose ...
```
Or export it in your shell:
```bash
export PINE_VERSION=$(./version.sh)
docker-compose ...
```
The docker environment is run using docker-compose. There are two supported configurations: the
default and the prod configuration.
@@ -105,25 +164,23 @@ To build the images for DEFAULT configuration:
```bash
docker-compose build
```
Or use the convenience script:
```bash
./run_docker_compose.sh --build
```
To run containers for DEFAULT configuration (add the `-d` flag to run as daemons):
```bash
docker-compose up
```
You may also want the `--abort-on-container-exit` flag, which will make errors more apparent.
Or use the convenience script:
```bash
./run_docker_compose.sh --up
```
To bring containers down for DEFAULT configuration:
```bash
docker-compose down
```
With default settings, the webapp will now be accessible at `https://localhost:8888`
### Production Docker Environment
@@ -135,24 +192,31 @@ docker-compose -f docker-compose.yml -f docker-compose.prod.yml build
Note that you probably need to update `.env` and add the `MONGO_URI` property.
### Test data
To import test data, you need to run the docker-compose stack using the docker-compose.test.yml file:
```bash
docker-compose build
docker-compose -f docker-compose.yml -f docker-compose.override.yml -f docker-compose.test.yml up
```
Or use the convenience script:
```bash
./run_docker_compose.sh --build
./run_docker_compose.sh --up-test
```
Once the system is up and running:
```bash
./setup_docker_test_data.sh
```
Once the test data has been imported, you no longer need to use the docker-compose.test.yml file.
If you need to clear the database, bring down the container and remove the `nlp_webapp_eve_db` and
`nlp_webapp_eve_logs` volumes with `docker volume rm`.
If you are migrating from very old PINE versions, it is possible that you need to migrate your
data if you are seeing applications errors:
```bash
docker-compose exec eve python3 python/update_documents_annnotation_status.py
```
@@ -184,6 +248,23 @@ docker-compose exec backend scripts/data/set_user_password.sh <email username> <
Alternatively, there is an Admin Dashboard through the web interface.
----------------------------------------------------------------------------------------------------
## Misc Configuration
### Configuring Logging
See logging configuration files in `./shared/`. `logging.python.dev.json` is used with the
dev stack; the other files are used in the docker containers.
The docker-compose stack is currently set to bind the `./shared/` directory into the containers
at run-time. This allows for configuration changes of the logging without needing to rebuild
containers, and also allows the python logging config to live in one place instead of spread out
into each container. This is controlled with the `${SHARED_VOLUME}` variable from `.env`.
Log files will be stored in the location given by the `${LOGS_VOLUME}` variable from `.env`.
Pipeline model files will be stored in the location given by the `${MODELS_VOLUME}` variable from `.env`.
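For illustration, here is a minimal sketch of how a container can apply the shared JSON config at
startup. The `PINE_LOGGING_CONFIG_FILE` variable appears in the gunicorn config later in this diff;
the exact loader used by each service is an assumption here.
```python
# Minimal sketch: apply the shared JSON logging config if one is mounted.
# The basicConfig fallback is an assumption, not the services' actual behavior.
import json
import logging.config
import os

config_file = os.environ.get("PINE_LOGGING_CONFIG_FILE")
if config_file and os.path.isfile(config_file):
    with open(config_file, "r") as f:
        logging.config.dictConfig(json.load(f))
else:
    logging.basicConfig(level=logging.INFO)
```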
### Collection/Document Images
It is now possible to explore images in the "annotate document" page in the frontend UI. The image


@@ -0,0 +1,4 @@
© 2019 The Johns Hopkins University Applied Physics Laboratory LLC.
* [![Prod Build Status](https://dev.azure.com/JH-PMAP/APPLICATIONS/_apis/build/status/oa-nlp_annotator/oa-nlp_annotator%20CI?branchName=master)](https://dev.azure.com/JH-PMAP/APPLICATIONS/_build/latest?definitionId=5&branchName=master)
* [![Dev Build Status](https://dev.azure.com/JH-PMAP/APPLICATIONS/_apis/build/status/oa-nlp_annotator/oa-nlp_annotator%20CI?branchName=develop)](https://dev.azure.com/JH-PMAP/APPLICATIONS/_build/latest?definitionId=5&branchName=develop)


@@ -5,6 +5,8 @@ parameters:
appUrl: ""
azureContainerRegistry: $(azureContainerRegistry)
azureSubscriptionEndpointForSecrets: $(azureSubscriptionEndpointForSecrets)
backendStorageMountPath: "/mnt/azure"
backendStorageShareName: ""
deployEnvironment: $(deployEnvironment)
deploymentName: "CONTAINER_DEPLOY"
helmChart: "pine-chart"
@@ -94,6 +96,10 @@ jobs:
image:
repository: ${{ parameters.azureContainerRegistry }}/${{ parameters.backendImageName }}
tag: ${{ parameters.imageTag }}
persistence:
enabled: true
shareName: ${{ parameters.backendStorageShareName }}
mountPath: ${{ parameters.backendStorageMountPath }}
nlpAnnotation:
image:
repository: ${{ parameters.azureContainerRegistry }}/${{ parameters.pipelineImageName }}


@@ -117,11 +117,15 @@ stages:
backendImageName: $(backendImageName)
frontendImageName: $(frontendImageName)
pipelineImageName: $(pipelineImageName)
backendStorageShareName: "pine-files-dev"
secrets:
backend:
VEGAS_CLIENT_SECRET: $(vegas-client-secret-dev)
eve:
MONGO_URI: $(mongo-uri-dev)
azure-secret:
azurestorageaccountname: $(azure-storage-account-name-dev)
azurestorageaccountkey: $(azure-storage-account-key-dev)
- stage: deploy_to_prod
displayName: Deploy to prod
condition: and(succeeded(), eq(variables['build.sourceBranch'], 'refs/heads/master'))
@@ -140,8 +144,12 @@ stages:
backendImageName: $(backendImageName)
frontendImageName: $(frontendImageName)
pipelineImageName: $(pipelineImageName)
backendStorageShareName: "pine-files-prod"
secrets:
backend:
VEGAS_CLIENT_SECRET: $(vegas-client-secret-prod)
eve:
MONGO_URI: $(mongo-uri-prod)
azure-secret:
azurestorageaccountname: $(azure-storage-account-name-prod)
azurestorageaccountkey: $(azure-storage-account-key-prod)


@@ -21,9 +21,6 @@ First-time setup:
Running the server:
* `./dev_run.sh`
Once test data has been set up in the eve layer, the script `setup_dev_data.sh`
can be used to set up data from the backend's perspective.
## Setup
Before running, you must edit ../.env and set `VEGAS_CLIENT_SECRET` appropriately


@@ -4,7 +4,7 @@
DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )"
CONFIG_FILE="${DIR}/pine/backend/config.py"
if [[ -z ${VEGAS_CLIENT_SECRET} ]]; then
if ([[ -z ${AUTH_MODULE} ]] || [[ ${AUTH_MODULE} == "vegas" ]]) && [[ -z ${VEGAS_CLIENT_SECRET} ]]; then
echo ""
echo ""
echo ""
@@ -19,4 +19,4 @@ fi
export FLASK_APP="pine.backend"
export FLASK_ENV="development"
pipenv run flask run
pipenv run flask run "$@"


@@ -5,6 +5,8 @@ import json
bind = "0.0.0.0:${PORT}"
workers = ${WORKERS}
accesslog = "-"
timeout = 60
limit_request_line = 0
if "PINE_LOGGING_CONFIG_FILE" in os.environ and os.path.isfile(os.environ["PINE_LOGGING_CONFIG_FILE"]):
with open(os.environ["PINE_LOGGING_CONFIG_FILE"], "r") as f:


@@ -3,7 +3,7 @@
GUNICORN_CONFIG_FILE="config.py"
if [[ -z ${VEGAS_CLIENT_SECRET} ]]; then
if ([[ -z ${AUTH_MODULE} ]] || [[ ${AUTH_MODULE} == "vegas" ]]) && [[ -z ${VEGAS_CLIENT_SECRET} ]]; then
echo ""
echo ""
echo ""


@@ -1,3 +1,5 @@
# (C) 2019 The Johns Hopkins University Applied Physics Laboratory LLC.
from .app import create_app
from .app import create_app, VERSION
__version__ = VERSION


@@ -6,7 +6,7 @@ import logging
from flask import abort, Blueprint, jsonify, request
from werkzeug import exceptions
from .. import auth, log
from .. import auth, collections, log
from ..data import service
from ..documents import bp as documents
@@ -19,7 +19,7 @@ CONFIG_ALLOW_OVERLAPPING_NER_ANNOTATIONS = "allow_overlapping_ner_annotations"
bp = Blueprint("annotations", __name__, url_prefix = "/annotations")
def check_document(doc_id):
def check_document_by_id(doc_id: str):
"""
Verify that a document with the given doc_id exists and that the logged in user has permissions to access the
document
@@ -29,6 +29,10 @@ def check_document(doc_id):
if not documents.user_can_view_by_id(doc_id):
raise exceptions.Unauthorized()
def check_document(doc: dict):
if not documents.user_can_view(doc):
raise exceptions.Unauthorized()
@bp.route("/mine/by_document_id/<doc_id>")
@auth.login_required
def get_my_annotations_for_document(doc_id):
@@ -38,7 +42,7 @@ def get_my_annotations_for_document(doc_id):
:param doc_id: str
:return: Response
"""
check_document(doc_id)
check_document_by_id(doc_id)
where = {
"document_id": doc_id,
"creator_id": auth.get_logged_in_user()["id"]
@@ -57,7 +61,7 @@ def get_others_annotations_for_document(doc_id):
:param doc_id: str
:return: str
"""
check_document(doc_id)
check_document_by_id(doc_id)
where = {
"document_id": doc_id,
# $eq doesn't work here for some reason -- maybe because objectid?
@@ -77,7 +81,7 @@ def get_annotations_for_document(doc_id):
:param doc_id: str
:return: str
"""
check_document(doc_id)
check_document_by_id(doc_id)
where = {
"document_id": doc_id
}
@@ -103,22 +107,15 @@ def get_current_annotation(doc_id, user_id):
else:
return None
def is_ner_annotation(ann):
"""
Verify that the provided annotation is in the valid format for an NER Annotation
:param ann: Any
:return: Bool
"""
return (type(ann) is list or type(ann) is tuple) and len(ann) == 3
def check_overlapping_annotations(document, ner_annotations):
ner_annotations.sort(key = lambda x: x[0])
resp = service.get("collections/" + document["collection_id"])
if not resp.ok:
abort(resp.status_code)
collection = resp.json()
# def is_ner_annotation(ann):
# """
# Verify that the provided annotation is in the valid format for an NER Annotation
# :param ann: Any
# :return: Bool
# """
# return (type(ann) is list or type(ann) is tuple) and len(ann) == 3
def check_overlapping_annotations(collection, ner_annotations):
# if allow_overlapping_ner_annotations is false, check them
if "configuration" in collection and CONFIG_ALLOW_OVERLAPPING_NER_ANNOTATIONS in collection["configuration"] and not collection["configuration"][CONFIG_ALLOW_OVERLAPPING_NER_ANNOTATIONS]:
for idx, val in enumerate(ner_annotations):
@@ -127,113 +124,114 @@ def check_overlapping_annotations(document, ner_annotations):
if val[0] < prev[1]:
raise exceptions.BadRequest("Collection is configured not to allow overlapping annotations")
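As a standalone illustration of the rule enforced above (hypothetical helper name; spans are
`(start, end, label)` tuples, and a span starting before the previous one ends counts as an overlap):
```python
def has_overlap(ner_annotations):
    # sort by start offset, then compare each span against its predecessor
    anns = sorted(ner_annotations, key=lambda x: x[0])
    return any(cur[0] < prev[1] for prev, cur in zip(anns, anns[1:]))

assert has_overlap([(0, 5, "PER"), (3, 8, "ORG")])      # 3 < 5: overlapping
assert not has_overlap([(0, 5, "PER"), (5, 8, "ORG")])  # touching spans are fine
```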
@bp.route("/mine/by_document_id/<doc_id>/ner", methods = ["POST", "PUT"])
@auth.login_required
def save_ner_annotations(doc_id):
"""
Save new NER annotations to the database as an entry for the logged in user, for the document. If there are already
annotations, use a patch request to update with the new annotations. If there are not, use a post request to create
a new entry.
:param doc_id: str
:return: str
"""
if not request.is_json:
raise exceptions.BadRequest()
check_document(doc_id)
document = service.get_item_by_id("documents", doc_id, {
"projection": json.dumps({
"collection_id": 1,
"metadata": 1
})
})
annotations = request.get_json()
user_id = auth.get_logged_in_user()["id"]
annotations = [(ann["start"], ann["end"], ann["label"]) for ann in annotations]
check_overlapping_annotations(document, annotations)
new_annotation = {
"creator_id": user_id,
"collection_id": document["collection_id"],
"document_id": doc_id,
"annotation": annotations
}
current_annotation = get_current_annotation(doc_id, user_id)
if current_annotation != None:
if current_annotation["annotation"] == annotations:
return jsonify(True)
headers = {"If-Match": current_annotation["_etag"]}
# add all the other non-ner labels
for annotation in current_annotation["annotation"]:
if not is_ner_annotation(annotation):
new_annotation["annotation"].append(annotation)
# @bp.route("/mine/by_document_id/<doc_id>/ner", methods = ["POST", "PUT"])
# @auth.login_required
# def save_ner_annotations(doc_id):
# """
# Save new NER annotations to the database as an entry for the logged in user, for the document. If there are already
# annotations, use a patch request to update with the new annotations. If there are not, use a post request to create
# a new entry.
# :param doc_id: str
# :return: str
# """
# if not request.is_json:
# raise exceptions.BadRequest()
# check_document_by_id(doc_id)
# document = service.get_item_by_id("documents", doc_id, {
# "projection": json.dumps({
# "collection_id": 1,
# "metadata": 1
# })
# })
# annotations = request.get_json()
# user_id = auth.get_logged_in_user()["id"]
# annotations = [(ann["start"], ann["end"], ann["label"]) for ann in annotations]
# check_overlapping_annotations(document, annotations)
# new_annotation = {
# "creator_id": user_id,
# "collection_id": document["collection_id"],
# "document_id": doc_id,
# "annotation": annotations
# }
#
# current_annotation = get_current_annotation(doc_id, user_id)
# if current_annotation != None:
# if current_annotation["annotation"] == annotations:
# return jsonify(True)
# headers = {"If-Match": current_annotation["_etag"]}
#
# # add all the other non-ner labels
# for annotation in current_annotation["annotation"]:
# if not is_ner_annotation(annotation):
# new_annotation["annotation"].append(annotation)
#
# resp = service.patch(["annotations", current_annotation["_id"]], json = new_annotation, headers = headers)
# else:
# resp = service.post("annotations", json = new_annotation)
#
# if resp.ok:
# new_annotation["_id"] = resp.json()["_id"]
# log.access_flask_annotate_document(document, new_annotation)
#
# return jsonify(resp.ok)
resp = service.patch(["annotations", current_annotation["_id"]], json = new_annotation, headers = headers)
else:
resp = service.post("annotations", json = new_annotation)
if resp.ok:
new_annotation["_id"] = resp.json()["_id"]
log.access_flask_annotate_document(document, new_annotation)
return jsonify(resp.ok)
# def is_doc_annotation(ann):
# """
# Verify that an annotation has the correct format (string)
# :param ann: Any
# :return: Bool
# """
# return isinstance(ann, str)
def is_doc_annotation(ann):
"""
Verify that an annotation has the correct format (string)
:param ann: Any
:return: Bool
"""
return isinstance(ann, str)
@bp.route("/mine/by_document_id/<doc_id>/doc", methods = ["POST", "PUT"])
@auth.login_required
def save_doc_labels(doc_id):
"""
Save new labels to the database as an entry for the logged in user, for the document. If there are already
annotations/labels, use a patch request to update with the new labels. If there are not, use a post request to
create a new entry.
:param doc_id:
:return:
"""
if not request.is_json:
raise exceptions.BadRequest()
check_document(doc_id)
document = service.get_item_by_id("documents", doc_id, {
"projection": json.dumps({
"collection_id": 1,
"metadata": 1
})
})
labels = request.get_json()
user_id = auth.get_logged_in_user()["id"]
new_annotation = {
"creator_id": user_id,
"collection_id": document["collection_id"],
"document_id": doc_id,
"annotation": labels
}
current_annotation = get_current_annotation(doc_id, user_id)
if current_annotation != None:
if current_annotation["annotation"] == labels:
return jsonify(True)
headers = {"If-Match": current_annotation["_etag"]}
# add all the other non-doc labels
for annotation in current_annotation["annotation"]:
if not is_doc_annotation(annotation):
new_annotation["annotation"].append(annotation)
resp = service.patch(["annotations", current_annotation["_id"]], json = new_annotation, headers = headers)
else:
resp = service.post("annotations", json = new_annotation)
if resp.ok:
new_annotation = resp.json()["_id"]
log.access_flask_annotate_document(document, new_annotation)
return jsonify(resp.ok)
# @bp.route("/mine/by_document_id/<doc_id>/doc", methods = ["POST", "PUT"])
# @auth.login_required
# def save_doc_labels(doc_id):
# """
# Save new labels to the database as an entry for the logged in user, for the document. If there are already
# annotations/labels, use a patch request to update with the new labels. If there are not, use a post request to
# create a new entry.
# :param doc_id:
# :return:
# """
# if not request.is_json:
# raise exceptions.BadRequest()
# check_document_by_id(doc_id)
# document = service.get_item_by_id("documents", doc_id, {
# "projection": json.dumps({
# "collection_id": 1,
# "metadata": 1
# })
# })
#
# labels = request.get_json()
# user_id = auth.get_logged_in_user()["id"]
# new_annotation = {
# "creator_id": user_id,
# "collection_id": document["collection_id"],
# "document_id": doc_id,
# "annotation": labels
# }
#
# current_annotation = get_current_annotation(doc_id, user_id)
# if current_annotation != None:
# if current_annotation["annotation"] == labels:
# return jsonify(True)
# headers = {"If-Match": current_annotation["_etag"]}
#
# # add all the other non-doc labels
# for annotation in current_annotation["annotation"]:
# if not is_doc_annotation(annotation):
# new_annotation["annotation"].append(annotation)
#
# resp = service.patch(["annotations", current_annotation["_id"]], json = new_annotation, headers = headers)
# else:
# resp = service.post("annotations", json = new_annotation)
#
# if resp.ok:
# new_annotation = resp.json()["_id"]
# log.access_flask_annotate_document(document, new_annotation)
#
# return jsonify(resp.ok)
def set_document_to_annotated_by_user(doc_id, user_id):
@@ -242,16 +240,76 @@ def set_document_to_annotated_by_user(doc_id, user_id):
document
:param doc_id: str
:param user_id: str
:return: Response | None
:return: whether the update succeeded
:rtype: bool
"""
document = service.get_item_by_id("/documents", doc_id)
document = service.get_item_by_id("/documents", doc_id, params={
"projection": json.dumps({
"has_annotated": 1
})
})
if "has_annotated" in document and user_id in document["has_annotated"] and document["has_annotated"][user_id]:
return True
new_document = {
"has_annotated": document["has_annotated"]
"has_annotated": document["has_annotated"] if "has_annotated" in document else {}
}
new_document["has_annotated"][user_id] = True
headers = {"If-Match": document["_etag"]}
return service.patch(["documents", doc_id], json=new_document, headers=headers).ok
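The `If-Match` header above is Eve's optimistic-concurrency mechanism. A hedged sketch of the same
pattern against a raw Eve endpoint, using plain `requests` (the base URL and resource names are
assumptions, not the internal `service` wrapper):
```python
import requests

EVE_URL = "http://localhost:5001"  # assumption: the eve layer from the dev stack

def patch_with_etag(resource: str, item_id: str, changes: dict) -> bool:
    item = requests.get(f"{EVE_URL}/{resource}/{item_id}").json()
    # Eve rejects the PATCH with 412 Precondition Failed if _etag is stale,
    # i.e. if another writer updated the item first.
    headers = {"If-Match": item["_etag"]}
    return requests.patch(f"{EVE_URL}/{resource}/{item_id}",
                          json=changes, headers=headers).ok
```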
def _make_annotations(body):
if not isinstance(body, dict) or "doc" not in body or "ner" not in body:
raise exceptions.BadRequest()
if (not isinstance(body["doc"], list) and not isinstance(body["doc"], tuple)) or \
(not isinstance(body["ner"], list) and not isinstance(body["ner"], tuple)):
raise exceptions.BadRequest()
doc_labels = body["doc"]
for ann in doc_labels:
if not isinstance(ann, str) or len(ann.strip()) == 0:
raise exceptions.BadRequest()
ner_annotations = body["ner"]
for (i, ann) in enumerate(ner_annotations):
if isinstance(ann, dict):
if "start" not in ann or "end" not in ann or "label" not in ann:
raise exceptions.BadRequest()
if not isinstance(ann["start"], int) or not isinstance(ann["end"], int) or \
not isinstance(ann["label"], str) or len(ann["label"].strip()) == 0:
raise exceptions.BadRequest()
ner_annotations[i] = (ann["start"], ann["end"], ann["label"])
elif isinstance(ann, list) or isinstance(ann, tuple):
if len(ann) != 3 or not isinstance(ann[0], int) or not isinstance(ann[1], int) or \
not isinstance(ann[2], str) or len(ann[2].strip()) == 0:
raise exceptions.BadRequest()
else:
raise exceptions.BadRequest()
ner_annotations.sort(key = lambda x: x[0])
return (doc_labels, ner_annotations)
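For reference, a sketch of the request body `_make_annotations` accepts; both NER shapes shown are
valid, and the values here are made up:
```python
body = {
    "doc": ["positive"],                              # document-level labels: non-empty strings
    "ner": [
        {"start": 10, "end": 15, "label": "PERSON"},  # dict form, converted to a tuple
        [0, 4, "ORG"],                                # list/tuple form, kept as-is
    ],
}
# After _make_annotations(body):
#   doc_labels      == ["positive"]
#   ner_annotations == [[0, 4, "ORG"], (10, 15, "PERSON")]  (sorted by start)
```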
def _add_or_update_annotation(new_annotation):
doc_id = new_annotation["document_id"]
user_id = new_annotation["creator_id"]
current_annotation = get_current_annotation(doc_id, user_id)
success = False
if current_annotation != None:
new_annotation["_id"] = current_annotation["_id"]
if current_annotation["annotation"] == new_annotation["annotation"]:
return new_annotation["_id"]
else:
headers = {"If-Match": current_annotation["_etag"]}
resp = service.patch(["annotations", current_annotation["_id"]], json = new_annotation, headers = headers)
else:
updated_annotated_field = set_document_to_annotated_by_user(doc_id, user_id)
resp = service.post("annotations", json = new_annotation)
success = resp.ok and updated_annotated_field
new_annotation["_id"] = resp.json()["_id"]
if success:
log.access_flask_annotate_document(new_annotation)
return new_annotation["_id"]
@bp.route("/mine/by_document_id/<doc_id>", methods = ["POST", "PUT"])
def save_annotations(doc_id):
@@ -260,47 +318,96 @@ def save_annotations(doc_id):
are already annotations, use a patch request to update with the new annotations. If there are not, use a post
request to create a new entry.
:param doc_id: str
:return: str
:return: bool
"""
# If you change input or output, update client modules pine.client.models and pine.client.client
if not request.is_json:
raise exceptions.BadRequest()
check_document(doc_id)
document = service.get_item_by_id("documents", doc_id, {
"projection": json.dumps({
document = service.get_item_by_id("documents", doc_id, params=service.params({
"projection": {
"collection_id": 1,
"metadata": 1
})
})
body = request.get_json()
if "doc" not in body or "ner" not in body:
raise exceptions.BadRequest()
labels = body["doc"]
annotations = [(ann["start"], ann["end"], ann["label"]) for ann in body["ner"]]
check_overlapping_annotations(document, annotations)
user_id = auth.get_logged_in_user()["id"]
}
}))
check_document(document)
body = request.get_json()
(doc_labels, ner_annotations) = _make_annotations(body)
collection = service.get_item_by_id("collections", document["collection_id"], params=service.params({
"projection": {
"configuration": 1
}
}))
check_overlapping_annotations(collection, ner_annotations)
user_id = auth.get_logged_in_user()["id"]
new_annotation = {
"creator_id": user_id,
"collection_id": document["collection_id"],
"document_id": doc_id,
"annotation": labels + annotations
"annotation": doc_labels + ner_annotations
}
return jsonify(_add_or_update_annotation(new_annotation))
@bp.route("/mine/by_collection_id/<collection_id>", methods = ["POST", "PUT"])
def save_collection_annotations(collection_id: str):
# If you change input or output, update client modules pine.client.models and pine.client.client
collection = service.get_item_by_id("collections", collection_id, params=service.params({
"projection": {
"configuration": 1,
"creator_id": 1,
"viewer": 1,
"annotators": 1
}
}))
if not collections.user_can_annotate(collection):
raise exceptions.Unauthorized()
if not request.is_json:
raise exceptions.BadRequest()
doc_annotations = request.get_json()
if not isinstance(doc_annotations, dict):
raise exceptions.BadRequest()
current_annotation = get_current_annotation(doc_id, user_id)
if current_annotation != None:
if current_annotation["annotation"] == new_annotation["annotation"]:
return jsonify(True)
headers = {"If-Match": current_annotation["_etag"]}
resp = service.patch(["annotations", current_annotation["_id"]], json = new_annotation, headers = headers)
else:
updated_annotated_field = set_document_to_annotated_by_user(doc_id, user_id)
resp = service.post("annotations", json = new_annotation)
skip_document_updates = json.loads(request.args.get("skip_document_updates", "false"))
# make sure all the documents actually belong to that collection
collection_ids = list(documents.get_collection_ids_for(doc_annotations.keys()))
if len(collection_ids) != 1 or collection_ids[0] != collection_id:
raise exceptions.Unauthorized()
user_id = auth.get_logged_in_user()["id"]
# first try batch mode
new_annotations = []
for (doc_id, body) in doc_annotations.items():
(doc_labels, ner_annotations) = _make_annotations(body)
check_overlapping_annotations(collection, ner_annotations)
new_annotations.append({
"creator_id": user_id,
"collection_id": collection_id,
"document_id": doc_id,
"annotation": doc_labels + ner_annotations
})
resp = service.post("annotations", json=new_annotations)
if resp.ok:
new_annotation["_id"] = resp.json()["_id"]
log.access_flask_annotate_document(document, new_annotation)
return jsonify(resp.ok)
for (i, created_annotation) in enumerate(resp.json()["_items"]):
new_annotations[i]["_id"] = created_annotation["_id"]
if not skip_document_updates:
set_document_to_annotated_by_user(new_annotations[i]["document_id"],
new_annotations[i]["creator_id"])
log.access_flask_annotate_documents(new_annotations)
return jsonify([annotation["_id"] for annotation in new_annotations])
# fall back on individual mode
added_ids = []
for annotation in new_annotations:
added_id = _add_or_update_annotation(annotation)
if added_id:
added_ids.append(added_id)
return jsonify(added_ids)
def init_app(app):
app.register_blueprint(bp)


@@ -6,14 +6,18 @@ import os
from . import log
log.setup_logging()
from flask import Flask, jsonify
from flask import Flask, abort, jsonify
from flask import __version__ as flask_version
from werkzeug import exceptions
from . import config
VERSION = os.environ.get("PINE_VERSION", "unknown-no-env")
LOGGER = logging.getLogger(__name__)
def handle_error(e):
logging.getLogger(__name__).error(e, exc_info=True)
return jsonify(e.description), e.code
return jsonify(str(e.description)), e.code
def handle_uncaught_exception(e):
if isinstance(e, exceptions.InternalServerError):
@@ -54,6 +58,23 @@ def create_app(test_config = None):
def ping():
return jsonify("pong")
from .data import service as service
@app.route("/about")
def about():
resp = service.get("about")
if not resp.ok:
abort(resp.status_code)
about = {
"version": VERSION,
"flask_version": flask_version,
"db": resp.json()
}
LOGGER.info("Eve service performance history:")
LOGGER.info(service.PERFORMANCE_HISTORY.pformat())
LOGGER.info("Version information:")
LOGGER.info(about)
return jsonify(about)
from . import cors
cors.init_app(app)


@@ -69,6 +69,8 @@ def flask_get_login_form() -> Response:
@bp.route("/logout", methods = ["POST"])
def flask_post_logout() -> Response:
user = module.get_logged_in_user()
if user == None:
raise exceptions.BadRequest()
module.logout()
log.access_flask_logout(user)
return Response(status = 200)


@@ -5,11 +5,11 @@ import bcrypt
import hashlib
def hash_password(password: str) -> str:
sha256 = hashlib.sha256(password.encode()).digest()
sha256 = hashlib.sha256(password.encode()).digest().replace(b"\x00", b"")
hashed_password_bytes = bcrypt.hashpw(sha256, bcrypt.gensalt())
return base64.b64encode(hashed_password_bytes).decode()
def check_password(password: str, hashed_password: str):
sha256 = hashlib.sha256(password.encode()).digest()
sha256 = hashlib.sha256(password.encode()).digest().replace(b"\x00", b"")
hashed_password_bytes = base64.b64decode(hashed_password.encode())
return bcrypt.checkpw(sha256, hashed_password_bytes)
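The `.replace(b"\x00", b"")` above matters because bcrypt treats its input as a C-style string: a
raw SHA-256 digest can contain NUL bytes, which bcrypt implementations either truncate at or reject
outright. A self-contained round trip of the scheme (pre-hashing with SHA-256 also keeps inputs
under bcrypt's 72-byte limit):
```python
import base64
import hashlib

import bcrypt  # assumption: the pyca/bcrypt package

def hash_password(password: str) -> str:
    # scrub NUL bytes from the digest before handing it to bcrypt
    sha256 = hashlib.sha256(password.encode()).digest().replace(b"\x00", b"")
    return base64.b64encode(bcrypt.hashpw(sha256, bcrypt.gensalt())).decode()

def check_password(password: str, hashed_password: str) -> bool:
    sha256 = hashlib.sha256(password.encode()).digest().replace(b"\x00", b"")
    return bcrypt.checkpw(sha256, base64.b64decode(hashed_password.encode()))

assert check_password("hunter2", hash_password("hunter2"))
```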


@@ -3,4 +3,4 @@
"""This module contains the api methods required to interact with, organize, create, and display collections in the
front-end and store the collections in the backend"""
from .bp import user_can_annotate, user_can_view, user_can_add_documents_or_images, user_can_modify_document_metadata, user_can_annotate_by_id, user_can_view_by_id, user_can_add_documents_or_images_by_id, user_can_modify_document_metadata_by_id
from .bp import user_can_annotate, user_can_view, user_can_add_documents_or_images, user_can_modify_document_metadata, user_can_annotate_by_id, user_can_annotate_by_ids, user_can_view_by_id, user_can_add_documents_or_images_by_id, user_can_modify_document_metadata_by_id


@@ -16,14 +16,27 @@ from .. import auth, log
from ..data import service
bp = Blueprint("collections", __name__, url_prefix = "/collections")
logger = logging.getLogger(__name__)
LOGGER = logging.getLogger(__name__)
DOCUMENTS_PER_TRANSACTION = 500
# Cache this info for uploading large numbers of images sequentially
LAST_COLLECTION_FOR_IMAGE = None
def is_cached_last_collection(collection_id):
global LAST_COLLECTION_FOR_IMAGE
return LAST_COLLECTION_FOR_IMAGE and LAST_COLLECTION_FOR_IMAGE[0] == collection_id and LAST_COLLECTION_FOR_IMAGE[1] == auth.get_logged_in_user()["id"]
def update_cached_last_collection(collection_id):
global LAST_COLLECTION_FOR_IMAGE
LAST_COLLECTION_FOR_IMAGE = [collection_id, auth.get_logged_in_user()["id"]]
def _collection_user_can_projection():
return {"projection": json.dumps({
"creator_id": 1,
"annotators": 1,
"viewers": 1
})}
return service.params({
"projection": {
"creator_id": 1,
"annotators": 1,
"viewers": 1
}
})
def _collection_user_can(collection, annotate):
user_id = auth.get_logged_in_user()["id"]
@@ -50,6 +63,21 @@ def user_can_annotate_by_id(collection_id):
collection = service.get_item_by_id("/collections", collection_id, _collection_user_can_projection())
return _collection_user_can(collection, annotate = True)
def user_can_annotate_by_ids(collection_ids):
collections = service.get_items("collections", params=service.params({
"where": {
"_id": {"$in": collection_ids}
}, "projection": {
"creator_id": 1,
"annotators": 1,
"viewers": 1
}
}))
for collection in collections:
if not user_can_annotate(collection):
return False
return True
def user_can_view_by_id(collection_id):
collection = service.get_item_by_id("/collections", collection_id, _collection_user_can_projection())
return _collection_user_can(collection, annotate = False)
@@ -163,7 +191,7 @@ def get_collection(collection_id):
:param collection_id: str
:return: Response
"""
resp = service.get("collections/" + collection_id)
resp = service.get(["collections", collection_id])
if not resp.ok:
abort(resp.status_code)
collection = resp.json()
@@ -175,7 +203,7 @@ def get_collection(collection_id):
@bp.route("/by_id/<collection_id>/download", methods = ["GET"])
@auth.login_required
def download_collection(collection_id):
resp = service.get("/collections/" + collection_id)
resp = service.get(["collections", collection_id])
if not resp.ok:
abort(resp.status_code)
collection = resp.json()
@@ -291,7 +319,7 @@ def add_annotator_to_collection(collection_id):
if not (collection["creator_id"] == auth_id):
raise exceptions.Unauthorized()
if user_id not in collection["annotators"]:
logger.info("new annotator: adding to collection")
LOGGER.info("new annotator: adding to collection")
collection["annotators"].append(user_id)
if user_id not in collection["viewers"]:
collection["viewers"].append(user_id)
@@ -324,7 +352,7 @@ def add_viewer_to_collection(collection_id):
if not (collection["creator_id"] == auth_id):
raise exceptions.Unauthorized()
if user_id not in collection["viewers"]:
logger.info("new viewer: adding to collection")
LOGGER.info("new viewer: adding to collection")
collection["viewers"].append(user_id)
to_patch = {
"viewers": collection["viewers"]
@@ -350,7 +378,7 @@ def add_label_to_collection(collection_id):
if not (collection["creator_id"] == auth_id):
raise exceptions.Unauthorized()
if new_label not in collection["labels"]:
logger.info("new viewer: adding to collection")
LOGGER.info("new label: adding to collection")
collection["labels"].append(new_label)
to_patch = {
"labels": collection["labels"]
@@ -375,6 +403,22 @@ def get_overlap_ids(collection_id):
return [doc["_id"] for doc in service.get_all_using_pagination("documents", params)['_items']]
def _upload_documents(collection, docs):
doc_resp = service.post("/documents", json=docs)
# TODO if it failed, roll back the created collection and classifier
if not doc_resp.ok:
abort(doc_resp.status_code, doc_resp.content)
r = doc_resp.json()
# TODO if it failed, roll back the created collection and classifier
if r["_status"] != "OK":
abort(400, "Unable to create documents")
for obj in r["_items"]:
if obj["_status"] != "OK":
abort(400, "Unable to create documents")
doc_ids = [obj["_id"] for obj in r["_items"]]
LOGGER.info("Added {} docs to collection {}".format(len(doc_ids), collection["_id"]))
return doc_ids
# Require a multipart form post:
# CSV is in the form file "file"
# Optional images are in the form file fields "imageFileN" where N is an (ignored) index
@@ -387,6 +431,7 @@ def get_overlap_ids(collection_id):
@bp.route("/", strict_slashes = False, methods = ["POST"])
@auth.login_required
def create_collection():
# If you change the requirements here, also update the client module pine.client.models
"""
Create a new collection based upon the entries provided in the POST request's associated form fields.
These fields include:
@@ -394,7 +439,7 @@ def create_collection():
overlap - ratio of overlapping documents. (0-1) with 0 being no overlap and 1 being every document has overlap, ex:
.90 - 90% of documents overlap
train_every - automatically train a new classifier after this many documents have been annotated
pipeline_id - the id value of the classifier pipeline associated with this collection (spacy, opennlp, corenlp)
pipelineId - the id value of the classifier pipeline associated with this collection (spacy, opennlp, corenlp)
classifierParameters - optional parameters that adjust the configuration of the chosen classifier pipeline.
archived - whether or not this collection should be archived.
A collection can be created with documents listed in a csv file. Each new line in the csv represents a new document.
@@ -451,7 +496,7 @@ def create_collection():
collection_id = r["_id"]
collection["_id"] = collection_id
log.access_flask_add_collection(collection)
logger.info("Created collection", collection_id)
LOGGER.info("Created collection {}".format(collection_id))
#create classifier
# require collection_id, overlap, pipeline_id and labels
@@ -473,7 +518,7 @@ def create_collection():
if r["_status"] != "OK":
abort(400, "Unable to create classifier")
classifier_id = r["_id"]
logger.info("Created classifier", classifier_id)
LOGGER.info("Created classifier {}".format(classifier_id))
# create metrics for classifier
# require collection_id, classifier_id, document_ids and annotations ids
@@ -492,7 +537,7 @@ def create_collection():
if r["_status"] != "OK":
abort(400, "Unable to create metrics")
metrics_id = r["_id"]
logger.info("Created metrics", metrics_id)
LOGGER.info("Created metrics {}".format(metrics_id))
#create documents if CSV file was sent in
doc_ids = []
@@ -529,20 +574,12 @@ def create_collection():
else:
doc["overlap"] = 0
docs.append(doc)
doc_resp = service.post("/documents", json=docs)
# TODO if it failed, roll back the created collection and classifier
if not doc_resp.ok:
abort(doc_resp.status_code, doc_resp.content)
r = doc_resp.json()
# TODO if it failed, roll back the created collection and classifier
if r["_status"] != "OK":
abort(400, "Unable to create documents")
logger.info(r["_items"])
for obj in r["_items"]:
if obj["_status"] != "OK":
abort(400, "Unable to create documents")
doc_ids = [obj["_id"] for obj in r["_items"]]
logger.info("Added docs:", doc_ids)
if len(docs) >= DOCUMENTS_PER_TRANSACTION:
doc_ids += _upload_documents(collection, docs)
docs = []
if len(docs) > 0:
doc_ids += _upload_documents(collection, docs)
docs = []
# create next ids
(doc_ids, overlap_ids) = get_doc_and_overlap_ids(collection_id)
@@ -568,8 +605,9 @@ def create_collection():
def _check_collection_and_get_image_dir(collection_id, path):
# make sure user can view collection
if not user_can_view_by_id(collection_id):
raise exceptions.Unauthorized()
if not is_cached_last_collection(collection_id):
if not user_can_view_by_id(collection_id):
raise exceptions.Unauthorized()
image_dir = current_app.config["DOCUMENT_IMAGE_DIR"]
if image_dir == None or len(image_dir) == 0:
@@ -582,6 +620,24 @@ def _check_collection_and_get_image_dir(collection_id, path):
return os.path.realpath(image_dir)
@bp.route("/static_images/<collection_id>", methods=["GET"])
@auth.login_required
def get_static_collection_images(collection_id):
static_image_dir = os.path.join(_check_collection_and_get_image_dir(collection_id, "static/"), "static")
urls = []
for _, _, filenames in os.walk(static_image_dir):
urls += ["/static/{}".format(f) for f in filenames]
return jsonify(urls)
@bp.route("/images/<collection_id>", methods=["GET"])
@auth.login_required
def get_collection_images(collection_id):
collection_image_dir = _check_collection_and_get_image_dir(collection_id, "")
urls = []
for _, _, filenames in os.walk(collection_image_dir):
urls += ["/{}".format(f) for f in filenames]
return jsonify(urls)
@bp.route("/image/<collection_id>/<path:path>", methods=["GET"])
@auth.login_required
def get_collection_image(collection_id, path):
@@ -639,11 +695,14 @@ def _upload_collection_image_file(collection_id, path, image_file):
@bp.route("/image/<collection_id>/<path:path>", methods=["POST", "PUT"])
@auth.login_required
def post_collection_image(collection_id, path):
if not user_can_add_documents_or_images_by_id(collection_id):
raise exceptions.Unauthorized()
if not is_cached_last_collection(collection_id):
if not user_can_add_documents_or_images_by_id(collection_id):
raise exceptions.Unauthorized()
if "file" not in request.files:
raise exceptions.BadRequest("Missing file form part.")
update_cached_last_collection(collection_id)
return jsonify(_upload_collection_image_file(collection_id, path, request.files["file"]))
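A hedged sketch of calling this endpoint (the backend base URL and authentication are assumptions;
the image must be sent in the `file` form part, per the check above):
```python
import requests

BACKEND_URL = "http://localhost:5000"  # assumption: dev-stack backend
collection_id = "<collection-id>"      # hypothetical

session = requests.Session()           # assumes an already-authenticated session
with open("page1.png", "rb") as f:
    resp = session.post(
        f"{BACKEND_URL}/collections/image/{collection_id}/page1.png",
        files={"file": f},  # the "file" form part is required
    )
```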
def init_app(app):


@@ -24,7 +24,8 @@ else:
REDIS_PORT = int(os.environ.get("REDIS_PORT", 6479))
AUTH_MODULE = os.environ.get("AUTH_MODULE", "vegas")
if not AUTH_MODULE: AUTH_MODULE = "vegas"
VEGAS_CLIENT_SECRET = os.environ.get("VEGAS_CLIENT_SECRET", None)
DOCUMENT_IMAGE_DIR = os.environ.get("DOCUMENT_IMAGE_DIR")
DOCUMENT_IMAGE_DIR = os.environ.get("DOCUMENT_IMAGE_DIR", "/mnt/azure")


@@ -3,7 +3,7 @@
import json
import logging
import math
from pprint import pprint
from pprint import pformat, pprint
import threading
from flask import abort, current_app, Response
@@ -23,6 +23,9 @@ class PerformanceHistory(object):
}
self.lock = threading.Lock()
def pformat(self, **kwargs):
return pformat(self.data, **kwargs)
def pprint(self):
self.lock.acquire()
try:
@@ -53,6 +56,7 @@ class PerformanceHistory(object):
PERFORMANCE_HISTORY = PerformanceHistory()
def _standardize_path(path, *additional_paths):
# if you change this, also update client code in pine.client.client module
if type(path) not in [list, tuple, set]:
path = [path]
if additional_paths:

View File

@@ -12,7 +12,16 @@ def get_all_users():
return service.get_items("/users")
def get_user(user_id):
return service.get_item_by_id("/users", user_id)
# getting by ID in the normal way doesn't work sometimes
items = service.get_items("users", params=service.params({
"where": {
"_id": user_id
}
}))
if items and len(items) == 1:
return items[0]
else:
abort(404)
def get_user_by_email(email):
where = {
@@ -107,7 +116,7 @@ def reset_user_passwords():
service.remove_nonupdatable_fields(user)
headers = {"If-Match": etag}
click.echo("Putting to {}: {}".format(service.url("users", user["_id"]), user))
resp = service.put("users", user["_id"], json = user, headers = headers)
resp = service.put(["users", user["_id"]], json = user, headers = headers)
if not resp.ok:
click.echo("Failure! {}".format(resp))
else:

View File

@@ -1,6 +1,7 @@
# (C) 2019 The Johns Hopkins University Applied Physics Laboratory LLC.
import json
import random
from flask import abort, Blueprint, jsonify, request
from werkzeug import exceptions
@@ -15,6 +16,18 @@ def _document_user_can_projection():
"collection_id": 1
}})
def get_collection_ids_for(document_ids) -> set:
if isinstance(document_ids, str):
document_ids = [document_ids]
# ideally we would use some "unique" or "distinct" feature here but eve doesn't seem to have it
return set(item["collection_id"] for item in service.get_items("documents", params=service.params({
"where": {
"_id": {"$in": list(document_ids)}
}, "projection": {
"collection_id": 1
}
})))
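Hedged sketch of the raw Eve query this corresponds to (`where` and `projection` are standard Eve
query parameters; the base URL and ids are made up, and `service.params` is assumed to JSON-encode
its values):
```python
import json

import requests

resp = requests.get("http://localhost:5001/documents", params={
    "where": json.dumps({"_id": {"$in": ["<doc-id-1>", "<doc-id-2>"]}}),
    "projection": json.dumps({"collection_id": 1}),
})
# de-duplicate client-side, since eve has no "distinct" feature
collection_ids = {item["collection_id"] for item in resp.json()["_items"]}
```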
def user_can_annotate(document):
return collections.user_can_annotate_by_id(document["collection_id"])
@@ -28,6 +41,9 @@ def user_can_annotate_by_id(document_id):
document = service.get_item_by_id("documents", document_id, params=_document_user_can_projection())
return user_can_annotate(document)
def user_can_annotate_by_ids(document_ids):
return collections.user_can_annotate_by_ids(get_collection_ids_for(document_ids))
def user_can_view_by_id(document_id):
document = service.get_item_by_id("documents", document_id, params=_document_user_can_projection())
return user_can_view(document)
@@ -63,34 +79,147 @@ def get_documents_in_collection(col_id, page):
"collection_id": col_id
})
if truncate:
params["projection"] = json.dumps({"metadata": 0})
params["truncate"] = truncate_length
if truncate_length == 0:
params["projection"] = json.dumps({
"metadata": 0,
"text": 0
})
else:
params["projection"] = json.dumps({
"metadata": 0
})
params["truncate"] = truncate_length
if page == "all":
return jsonify(service.get_all_using_pagination("documents", params))
if page: params["page"] = page
resp = service.get("documents", params = params)
if not resp.ok:
abort(resp.status_code, resp.content)
data = resp.json()
if truncate:
if truncate and truncate_length != 0:
for document in data["_items"]:
document["text"] = document["text"][0:truncate_length]
return jsonify(data)
def _check_documents(documents) -> dict:
collection_ids = set()
for document in documents:
if not isinstance(document, dict) or "collection_id" not in document or not document["collection_id"]:
raise exceptions.BadRequest()
collection_ids.add(document["collection_id"])
collections_by_id = {}
for collection_id in collection_ids:
collection = service.get_item_by_id("collections", collection_id, params=service.params({
"projection": {
"creator_id": 1,
"annotators": 1,
"viewers": 1
}
}))
if not collections.user_can_add_documents_or_images(collection):
raise exceptions.Unauthorized()
collections_by_id[collection_id] = collection
return collections_by_id
@bp.route("/", strict_slashes = False, methods = ["POST"])
@auth.login_required
def add_document():
document = request.get_json()
if not document or "collection_id" not in document or not document["collection_id"]:
docs = request.get_json()
if not docs or (not isinstance(docs, dict) and not isinstance(docs, list) and not isinstance(docs, tuple)):
raise exceptions.BadRequest()
if not collections.user_can_add_documents_or_images_by_id(document["collection_id"]):
raise exceptions.Unauthorized()
resp = service.post("documents", json=document)
if resp.ok:
log.access_flask_add_document(resp.json())
return service.convert_response(resp)
collections_by_id = _check_documents(docs if isinstance(docs, list) else [docs])
# Get overlap information stored in related classifier db object, and assign overlap for added document
collection_classifiers = {}
for doc in (docs if isinstance(docs, list) else [docs]):
# get classifier overlaps
if doc["collection_id"] not in collection_classifiers:
params = service.params({
"where": {
"collection_id": doc["collection_id"]
}, "projection": {
"overlap": 1
}
})
resp = service.get("classifiers", params=params)
if not resp.ok:
abort(resp.status_code)
classifier_obj = resp.json()["_items"]
if len(classifier_obj) != 1:
raise exceptions.BadRequest()
collection_classifiers[doc["collection_id"]] = classifier_obj[0]
classifier = collection_classifiers[doc["collection_id"]]
overlap = classifier["overlap"]
doc["overlap"] = 1 if random.random() < overlap else 0
# initialize has_annotated dict
if "has_annotated" not in doc:
doc["has_annotated"] = {user_id: False for user_id in collections_by_id[doc["collection_id"]]["annotators"]}
# Add document(s) to database
doc_resp = service.post("documents", json=docs)
if doc_resp.ok:
if isinstance(docs, dict):
log.access_flask_add_document(doc_resp.json())
else:
log.access_flask_add_documents(doc_resp.json()["_items"])
else:
abort(doc_resp.status_code)
if isinstance(docs, dict):
docs = [docs]
doc_ids = [doc_resp.json()["_id"]]
else:
doc_ids = [d["_id"] for d in doc_resp.json()["_items"]]
# Update next instances for added documents
classifier_next_instances = {}
for (i, document) in enumerate(docs):
doc_id = doc_ids[i]
classifier = collection_classifiers[document["collection_id"]]
classifier_id = classifier["_id"]
if classifier_id not in classifier_next_instances:
# Get next_instances to which we'll add document
next_instances_params = service.params({
"where": {
"classifier_id": classifier_id
}, "projection": {
"overlap_document_ids": 1,
"document_ids": 1
}
})
resp = service.get("next_instances", params=next_instances_params)
if not resp.ok:
abort(resp.status_code)
next_instances_obj = resp.json()["_items"]
if len(next_instances_obj) != 1:
raise exceptions.BadRequest()
classifier_next_instances[classifier_id] = next_instances_obj[0]
next_instances = classifier_next_instances[classifier_id]
if document["overlap"] == 1:
# Add document to overlap IDs for each annotator if it's an overlap document
for annotator in next_instances["overlap_document_ids"]:
next_instances["overlap_document_ids"][annotator].append(doc_id)
else:
# Add document to document_ids if it's not an overlap document
next_instances["document_ids"].append(doc_id)
# Patch next_instances with new documents
for next_instances in classifier_next_instances.values():
headers = {"If-Match": next_instances["_etag"]}
service.remove_nonupdatable_fields(next_instances)
resp = service.patch(["next_instances", next_instances["_id"]], json=next_instances, headers=headers)
if not resp.ok:
raise exceptions.BadRequest()
return service.convert_response(doc_resp)
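For reference, a sketch of the two body shapes this endpoint now accepts (field values are made up;
`overlap` and `has_annotated` are filled in server-side when absent, as shown above):
```python
single = {"collection_id": "<collection-id>", "text": "one document"}
batch = [
    {"collection_id": "<collection-id>", "text": "first document"},
    {"collection_id": "<collection-id>", "text": "second document"},
]
# POST either `single` or `batch` as the JSON body of /documents/.
```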
@bp.route("/can_annotate/<doc_id>", methods = ["GET"])
@auth.login_required


@@ -1,7 +1,10 @@
# (C) 2019 The Johns Hopkins University Applied Physics Laboratory LLC.
import enum
import json
import logging.config
import os
import typing
# make sure this package has been installed
import pythonjsonlogger
@@ -17,7 +20,9 @@ class Action(enum.Enum):
CREATE_COLLECTION = enum.auto()
VIEW_DOCUMENT = enum.auto()
ADD_DOCUMENT = enum.auto()
ADD_DOCUMENTS = enum.auto()
ANNOTATE_DOCUMENT = enum.auto()
ANNOTATE_DOCUMENTS = enum.auto()
def setup_logging():
if CONFIG_FILE_ENV not in os.environ:
@@ -51,10 +56,10 @@ def get_flask_logged_in_user():
def access_flask_login():
access(Action.LOGIN, get_flask_logged_in_user(), get_flask_request_info(), None)
def access_flask_logout(user):
def access_flask_logout(user: dict):
access(Action.LOGOUT, {"id": user["id"], "username": user["username"]}, get_flask_request_info(), None)
def access_flask_add_collection(collection):
def access_flask_add_collection(collection: dict):
extra_info = {
"collection_id": collection["_id"]
}
@@ -65,7 +70,7 @@ def access_flask_add_collection(collection):
del extra_info["collection_metadata"][k]
access(Action.CREATE_COLLECTION, get_flask_logged_in_user(), get_flask_request_info(), None, **extra_info)
def access_flask_view_document(document):
def access_flask_view_document(document: dict):
extra_info = {
"document_id": document["_id"]
}
@@ -73,7 +78,7 @@ def access_flask_view_document(document):
extra_info["document_metadata"] = document["metadata"]
access(Action.VIEW_DOCUMENT, get_flask_logged_in_user(), get_flask_request_info(), None, **extra_info)
def access_flask_add_document(document):
def access_flask_add_document(document: dict):
extra_info = {
"document_id": document["_id"]
}
@@ -81,15 +86,31 @@ def access_flask_add_document(document):
extra_info["document_metadata"] = document["metadata"]
access(Action.ADD_DOCUMENT, get_flask_logged_in_user(), get_flask_request_info(), None, **extra_info)
def access_flask_annotate_document(document, annotation):
def access_flask_add_documents(documents: typing.List[dict]):
doc_info = []
for document in documents:
i = {"id": document["_id"]}
if "metadata" in document: i["metadata"] = document["metadata"]
doc_info.append(i)
extra_info = {
"document_id": document["_id"],
"documents": doc_info
}
access(Action.ADD_DOCUMENTS, get_flask_logged_in_user(), get_flask_request_info(), None, **extra_info)
def access_flask_annotate_document(annotation):
extra_info = {
"document_id": annotation["document_id"],
"annotation_id": annotation["_id"]
}
if "metadata" in document:
extra_info["document_metadata"] = document["metadata"]
access(Action.ANNOTATE_DOCUMENT, get_flask_logged_in_user(), get_flask_request_info(), None, **extra_info)
def access_flask_annotate_documents(annotations: typing.List[dict]):
extra_info = {
"document_ids": [annotation["document_id"] for annotation in annotations],
"annotation_ids": [annotation["_id"] for annotation in annotations]
}
access(Action.ANNOTATE_DOCUMENTS, get_flask_logged_in_user(), get_flask_request_info(), None, **extra_info)
###############
def access(action, user, request_info, message, **extra_info):


@@ -67,7 +67,7 @@ def fix_num_for_json(number):
def getIAAReportForCollection(collection_id):
combined = get_doc_annotations(collection_id) ## exclude=set(['bchee1'])
combined = get_doc_annotations(collection_id)
labels = set()
for v in combined.values():


@@ -66,16 +66,6 @@ def _get_classifier_metrics(classifier_id):
logger.info(all_metrics)
return all_metrics
def _get_collection_classifier(collection_id):
where = {
"collection_id": collection_id
}
classifiers = service.get_items("/classifiers", params=service.where_params(where))
if len(classifiers) != 1:
raise exceptions.BadRequest(description="Expected one classifier but found {}.".format(len(classifiers)))
return classifiers[0]
@bp.route("/metrics", methods=["GET"])
@auth.login_required
def get_metrics():
@@ -140,6 +130,8 @@ def get_next_by_classifier(classifier_id):
return jsonify(instance["overlap_document_ids"][user_id].pop())
elif len(instance["document_ids"]) > 0:
return jsonify(instance["document_ids"].pop())
elif len(instance["overlap_document_ids"][user_id]) > 0:
return jsonify(instance["overlap_document_ids"][user_id].pop())
else:
return jsonify(None)

client/Pipfile Normal file

@@ -0,0 +1,18 @@
# (C) 2019 The Johns Hopkins University Applied Physics Laboratory LLC.
[[source]]
url = "https://pypi.org/simple"
verify_ssl = true
name = "pypi"
[packages]
pymongo = "*"
requests = "*"
overrides = "*"
python-json-logger = "*"
bcrypt = "*"
[dev-packages]
[requires]
python_version = "3.6"

client/Pipfile.lock generated Normal file

@@ -0,0 +1,201 @@
{
"_meta": {
"hash": {
"sha256": "a97ae1c4a3394a19df62fcd5603bd637df6dbdda51e75c91bc5594cb1a68ac48"
},
"pipfile-spec": 6,
"requires": {
"python_version": "3.6"
},
"sources": [
{
"name": "pypi",
"url": "https://pypi.org/simple",
"verify_ssl": true
}
]
},
"default": {
"bcrypt": {
"hashes": [
"sha256:0258f143f3de96b7c14f762c770f5fc56ccd72f8a1857a451c1cd9a655d9ac89",
"sha256:0b0069c752ec14172c5f78208f1863d7ad6755a6fae6fe76ec2c80d13be41e42",
"sha256:19a4b72a6ae5bb467fea018b825f0a7d917789bcfe893e53f15c92805d187294",
"sha256:5432dd7b34107ae8ed6c10a71b4397f1c853bd39a4d6ffa7e35f40584cffd161",
"sha256:6305557019906466fc42dbc53b46da004e72fd7a551c044a827e572c82191752",
"sha256:69361315039878c0680be456640f8705d76cb4a3a3fe1e057e0f261b74be4b31",
"sha256:6fe49a60b25b584e2f4ef175b29d3a83ba63b3a4df1b4c0605b826668d1b6be5",
"sha256:74a015102e877d0ccd02cdeaa18b32aa7273746914a6c5d0456dd442cb65b99c",
"sha256:763669a367869786bb4c8fcf731f4175775a5b43f070f50f46f0b59da45375d0",
"sha256:8b10acde4e1919d6015e1df86d4c217d3b5b01bb7744c36113ea43d529e1c3de",
"sha256:9fe92406c857409b70a38729dbdf6578caf9228de0aef5bc44f859ffe971a39e",
"sha256:a190f2a5dbbdbff4b74e3103cef44344bc30e61255beb27310e2aec407766052",
"sha256:a595c12c618119255c90deb4b046e1ca3bcfad64667c43d1166f2b04bc72db09",
"sha256:c9457fa5c121e94a58d6505cadca8bed1c64444b83b3204928a866ca2e599105",
"sha256:cb93f6b2ab0f6853550b74e051d297c27a638719753eb9ff66d1e4072be67133",
"sha256:ce4e4f0deb51d38b1611a27f330426154f2980e66582dc5f438aad38b5f24fc1",
"sha256:d7bdc26475679dd073ba0ed2766445bb5b20ca4793ca0db32b399dccc6bc84b7",
"sha256:ff032765bb8716d9387fd5376d987a937254b0619eff0972779515b5c98820bc"
],
"index": "pypi",
"version": "==3.1.7"
},
"certifi": {
"hashes": [
"sha256:5930595817496dd21bb8dc35dad090f1c2cd0adfaf21204bf6732ca5d8ee34d3",
"sha256:8fc0819f1f30ba15bdb34cceffb9ef04d99f420f68eb75d901e9560b8749fc41"
],
"version": "==2020.6.20"
},
"cffi": {
"hashes": [
"sha256:001bf3242a1bb04d985d63e138230802c6c8d4db3668fb545fb5005ddf5bb5ff",
"sha256:00789914be39dffba161cfc5be31b55775de5ba2235fe49aa28c148236c4e06b",
"sha256:028a579fc9aed3af38f4892bdcc7390508adabc30c6af4a6e4f611b0c680e6ac",
"sha256:14491a910663bf9f13ddf2bc8f60562d6bc5315c1f09c704937ef17293fb85b0",
"sha256:1cae98a7054b5c9391eb3249b86e0e99ab1e02bb0cc0575da191aedadbdf4384",
"sha256:2089ed025da3919d2e75a4d963d008330c96751127dd6f73c8dc0c65041b4c26",
"sha256:2d384f4a127a15ba701207f7639d94106693b6cd64173d6c8988e2c25f3ac2b6",
"sha256:337d448e5a725bba2d8293c48d9353fc68d0e9e4088d62a9571def317797522b",
"sha256:399aed636c7d3749bbed55bc907c3288cb43c65c4389964ad5ff849b6370603e",
"sha256:3b911c2dbd4f423b4c4fcca138cadde747abdb20d196c4a48708b8a2d32b16dd",
"sha256:3d311bcc4a41408cf5854f06ef2c5cab88f9fded37a3b95936c9879c1640d4c2",
"sha256:62ae9af2d069ea2698bf536dcfe1e4eed9090211dbaafeeedf5cb6c41b352f66",
"sha256:66e41db66b47d0d8672d8ed2708ba91b2f2524ece3dee48b5dfb36be8c2f21dc",
"sha256:675686925a9fb403edba0114db74e741d8181683dcf216be697d208857e04ca8",
"sha256:7e63cbcf2429a8dbfe48dcc2322d5f2220b77b2e17b7ba023d6166d84655da55",
"sha256:8a6c688fefb4e1cd56feb6c511984a6c4f7ec7d2a1ff31a10254f3c817054ae4",
"sha256:8c0ffc886aea5df6a1762d0019e9cb05f825d0eec1f520c51be9d198701daee5",
"sha256:95cd16d3dee553f882540c1ffe331d085c9e629499ceadfbda4d4fde635f4b7d",
"sha256:99f748a7e71ff382613b4e1acc0ac83bf7ad167fb3802e35e90d9763daba4d78",
"sha256:b8c78301cefcf5fd914aad35d3c04c2b21ce8629b5e4f4e45ae6812e461910fa",
"sha256:c420917b188a5582a56d8b93bdd8e0f6eca08c84ff623a4c16e809152cd35793",
"sha256:c43866529f2f06fe0edc6246eb4faa34f03fe88b64a0a9a942561c8e22f4b71f",
"sha256:cab50b8c2250b46fe738c77dbd25ce017d5e6fb35d3407606e7a4180656a5a6a",
"sha256:cef128cb4d5e0b3493f058f10ce32365972c554572ff821e175dbc6f8ff6924f",
"sha256:cf16e3cf6c0a5fdd9bc10c21687e19d29ad1fe863372b5543deaec1039581a30",
"sha256:e56c744aa6ff427a607763346e4170629caf7e48ead6921745986db3692f987f",
"sha256:e577934fc5f8779c554639376beeaa5657d54349096ef24abe8c74c5d9c117c3",
"sha256:f2b0fa0c01d8a0c7483afd9f31d7ecf2d71760ca24499c8697aeb5ca37dc090c"
],
"version": "==1.14.0"
},
"chardet": {
"hashes": [
"sha256:84ab92ed1c4d4f16916e05906b6b75a6c0fb5db821cc65e70cbd64a3e2a5eaae",
"sha256:fc323ffcaeaed0e0a02bf4d117757b98aed530d9ed4531e3e15460124c106691"
],
"version": "==3.0.4"
},
"idna": {
"hashes": [
"sha256:b307872f855b18632ce0c21c5e45be78c0ea7ae4c15c828c20788b26921eb3f6",
"sha256:b97d804b1e9b523befed77c48dacec60e6dcb0b5391d57af6a65a312a90648c0"
],
"version": "==2.10"
},
"overrides": {
"hashes": [
"sha256:30f761124579e59884b018758c4d7794914ef02a6c038621123fec49ea7599c6"
],
"index": "pypi",
"version": "==3.1.0"
},
"pycparser": {
"hashes": [
"sha256:2d475327684562c3a96cc71adf7dc8c4f0565175cf86b6d7a404ff4c771f15f0",
"sha256:7582ad22678f0fcd81102833f60ef8d0e57288b6b5fb00323d101be910e35705"
],
"version": "==2.20"
},
"pymongo": {
"hashes": [
"sha256:01b4e10027aef5bb9ecefbc26f5df3368ce34aef81df43850f701e716e3fe16d",
"sha256:0fc5aa1b1acf7f61af46fe0414e6a4d0c234b339db4c03a63da48599acf1cbfc",
"sha256:1396eb7151e0558b1f817e4b9d7697d5599e5c40d839a9f7270bd90af994ad82",
"sha256:18e84a3ec5e73adcb4187b8e5541b2ad61d716026ed9863267e650300d8bea33",
"sha256:19adf2848b80cb349b9891cc854581bbf24c338be9a3260e73159bdeb2264464",
"sha256:20ee0475aa2ba437b0a14806f125d696f90a8433d820fb558fdd6f052acde103",
"sha256:26798795097bdeb571f13942beef7e0b60125397811c75b7aa9214d89880dd1d",
"sha256:26e707a4eb851ec27bb969b5f1413b9b2eac28fe34271fa72329100317ea7c73",
"sha256:2a3c7ad01553b27ec553688a1e6445e7f40355fb37d925c11fcb50b504e367f8",
"sha256:2f07b27dbf303ea53f4147a7922ce91a26b34a0011131471d8aaf73151fdee9a",
"sha256:316f0cf543013d0c085e15a2c8abe0db70f93c9722c0f99b6f3318ff69477d70",
"sha256:31d11a600eea0c60de22c8bdcb58cda63c762891facdcb74248c36713240987f",
"sha256:334ef3ffd0df87ea83a0054454336159f8ad9c1b389e19c0032d9cb8410660e6",
"sha256:358ba4693c01022d507b96a980ded855a32dbdccc3c9331d0667be5e967f30ed",
"sha256:3a6568bc53103df260f5c7d2da36dffc5202b9a36c85540bba1836a774943794",
"sha256:444bf2f44264578c4085bb04493bfed0e5c1b4fe7c2704504d769f955cc78fe4",
"sha256:47a00b22c52ee59dffc2aad02d0bbfb20c26ec5b8de8900492bf13ad6901cf35",
"sha256:4c067db43b331fc709080d441cb2e157114fec60749667d12186cc3fc8e7a951",
"sha256:4c092310f804a5d45a1bcaa4191d6d016c457b6ed3982a622c35f729ff1c7f6b",
"sha256:53b711b33134e292ef8499835a3df10909c58df53a2a0308f598c432e9a62892",
"sha256:568d6bee70652d8a5af1cd3eec48b4ca1696fb1773b80719ebbd2925b72cb8f6",
"sha256:56fa55032782b7f8e0bf6956420d11e2d4e9860598dfe9c504edec53af0fc372",
"sha256:5a2c492680c61b440272341294172fa3b3751797b1ab983533a770e4fb0a67ac",
"sha256:61235cc39b5b2f593086d1d38f3fc130b2d125bd8fc8621d35bc5b6bdeb92bd2",
"sha256:619ac9aaf681434b4d4718d1b31aa2f0fce64f2b3f8435688fcbdc0c818b6c54",
"sha256:6238ac1f483494011abde5286282afdfacd8926659e222ba9b74c67008d3a58c",
"sha256:63752a72ca4d4e1386278bd43d14232f51718b409e7ac86bcf8810826b531113",
"sha256:6fdc5ccb43864065d40dd838437952e9e3da9821b7eac605ba46ada77f846bdf",
"sha256:7abc3a6825a346fa4621a6f63e3b662bbb9e0f6ffc32d30a459d695f20fb1a8b",
"sha256:7aef381bb9ae8a3821abd7f9d4d93978dbd99072b48522e181baeffcd95b56ae",
"sha256:80df3caf251fe61a3f0c9614adc6e2bfcffd1cd3345280896766712fb4b4d6d7",
"sha256:95f970f34b59987dee6f360d2e7d30e181d58957b85dff929eee4423739bd151",
"sha256:993257f6ca3cde55332af1f62af3e04ca89ce63c08b56a387cdd46136c72f2fa",
"sha256:9c0a57390549affc2b5dda24a38de03a5c7cbc58750cd161ff5d106c3c6eec80",
"sha256:a0794e987d55d2f719cc95fcf980fc62d12b80e287e6a761c4be14c60bd9fecc",
"sha256:a3b98121e68bf370dd8ea09df67e916f93ea95b52fc010902312168c4d1aff5d",
"sha256:a60756d55f0887023b3899e6c2923ba5f0042fb11b1d17810b4e07395404f33e",
"sha256:a676bd2fbc2309092b9bbb0083d35718b5420af3a42135ebb1e4c3633f56604d",
"sha256:a732838c78554c1257ff2492f5c8c4c7312d0aecd7f732149e255f3749edd5ee",
"sha256:ae65d65fde4135ef423a2608587c9ef585a3551fc2e4e431e7c7e527047581be",
"sha256:b070a4f064a9edb70f921bfdc270725cff7a78c22036dd37a767c51393fb956f",
"sha256:b6da85949aa91e9f8c521681344bd2e163de894a5492337fba8b05c409225a4f",
"sha256:bbf47110765b2a999803a7de457567389253f8670f7daafb98e059c899ce9764",
"sha256:c06b3f998d2d7160db58db69adfb807d2ec307e883e2f17f6b87a1ef6c723f11",
"sha256:c318fb70542be16d3d4063cde6010b1e4d328993a793529c15a619251f517c39",
"sha256:c4aef42e5fa4c9d5a99f751fb79caa880dac7eaf8a65121549318b984676a1b7",
"sha256:c9ca545e93a9c2a3bdaa2e6e21f7a43267ff0813e8055adf2b591c13164c0c57",
"sha256:da2c3220eb55c4239dd8b982e213da0b79023cac59fe54ca09365f2bc7e4ad32",
"sha256:dd8055da300535eefd446b30995c0813cc4394873c9509323762a93e97c04c03",
"sha256:e2b46e092ea54b732d98c476720386ff2ccd126de1e52076b470b117bff7e409",
"sha256:e334c4f39a2863a239d38b5829e442a87f241a92da9941861ee6ec5d6380b7fe",
"sha256:e5c54f04ca42bbb5153aec5d4f2e3d9f81e316945220ac318abd4083308143f5",
"sha256:f96333f9d2517c752c20a35ff95de5fc2763ac8cdb1653df0f6f45d281620606"
],
"index": "pypi",
"version": "==3.10.1"
},
"python-json-logger": {
"hashes": [
"sha256:b7a31162f2a01965a5efb94453ce69230ed208468b0bbc7fdfc56e6d8df2e281"
],
"index": "pypi",
"version": "==0.1.11"
},
"requests": {
"hashes": [
"sha256:b3559a131db72c33ee969480840fff4bb6dd111de7dd27c8ee1f820f4f00231b",
"sha256:fe75cc94a9443b9246fc7049224f75604b113c36acb93f87b80ed42c44cbb898"
],
"index": "pypi",
"version": "==2.24.0"
},
"six": {
"hashes": [
"sha256:30639c035cdb23534cd4aa2dd52c3bf48f06e5f4a941509c8bafd8ce11080259",
"sha256:8b74bedcbbbaca38ff6d7491d76f2b06b3592611af620f8426e82dddb04a5ced"
],
"version": "==1.15.0"
},
"urllib3": {
"hashes": [
"sha256:3018294ebefce6572a474f0604c2021e33b3fd8006ecd11d62107a5d2a963527",
"sha256:88206b0eb87e6d677d424843ac5209e3fb9d0190d0ee169599165ec25e9d9115"
],
"version": "==1.25.9"
}
},
"develop": {}
}

client/README.md Normal file

@@ -0,0 +1,14 @@
&copy; 2019 The Johns Hopkins University Applied Physics Laboratory LLC.
## Developer Environment
Required packages:
* python 3.6
* pipenv
`pipenv install --dev`
### Running client interactively
`./interactive_dev_run.sh` will connect the client using the dev stack values and then drop into
an interactive shell to allow interaction with the client.
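
A session might then look like the following sketch (the ports are the dev-stack defaults baked into `interactive_dev_run.sh`; the username and password are placeholders):

```python
>>> client.is_valid()       # backend, eve, and (if configured) mongo are reachable
True
>>> client.get_auth_module()
'eve'
>>> client.login_eve("someuser", "somepassword")
True
>>> pipelines = client.get_pipelines()
>>> client.logout()
```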

client/interactive_dev_run.sh Executable file

@@ -0,0 +1,27 @@
#!/bin/bash
# (C) 2019 The Johns Hopkins University Applied Physics Laboratory LLC.
DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" >/dev/null 2>&1 && pwd )"
BACKEND_PORT=${BACKEND_PORT:-5000}
EVE_PORT=${EVE_PORT:-5001}
MONGO_PORT=${MONGO_PORT:-27018}
pushd ${DIR} &> /dev/null
read -r -d '' CODE << EOF
import code;
import sys;
sys.path.append("${DIR}");
import pine.client;
pine.client.setup_logging();
client = pine.client.LocalPineClient("http://localhost:${BACKEND_PORT}", "http://localhost:${EVE_PORT}", "mongodb://localhost:${MONGO_PORT}");
code.interact(banner="",exitmsg="",local=locals())
EOF
echo "${CODE}"
echo ""
PINE_LOGGING_CONFIG_FILE=$(realpath ${DIR}/../shared/logging.python.dev.json) pipenv run python3 -c "${CODE}"
popd &> /dev/null


@@ -0,0 +1,20 @@
#!/bin/bash
# (C) 2019 The Johns Hopkins University Applied Physics Laboratory LLC.
DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" >/dev/null 2>&1 && pwd )"
source ${DIR}/../.env
export BACKEND_PORT
export EVE_PORT
export MONGO_PORT
if ! wget http://localhost:${BACKEND_PORT}/ping -O /dev/null -o /dev/null || ! wget http://localhost:${EVE_PORT} -O /dev/null -o /dev/null || ! wget http://localhost:${MONGO_PORT} -O /dev/null -o /dev/null; then
echo "Use docker-compose.test.yml when running docker compose stack."
exit 1
fi
pushd ${DIR} &> /dev/null
./interactive_dev_run.sh
popd &> /dev/null

client/pine/__init__.py Normal file

@@ -0,0 +1,4 @@
# (C) 2019 The Johns Hopkins University Applied Physics Laboratory LLC.
"""PINE main module.
"""


@@ -0,0 +1,8 @@
# (C) 2019 The Johns Hopkins University Applied Physics Laboratory LLC.
"""PINE client module.
"""
from .client import PineClient, LocalPineClient
from .log import setup_logging
from .models import CollectionBuilder


@@ -0,0 +1,737 @@
# (C) 2019 The Johns Hopkins University Applied Physics Laboratory LLC.
"""PINE client classes module.
"""
import abc
import json
import logging
import typing
from overrides import overrides
import pymongo
import requests
from . import exceptions, models, password
def _standardize_path(path: str, *additional_paths: typing.List[str]) -> typing.List[str]:
r"""Standardize path(s) into a list of path components.
:param path: relative path, e.g. ``"users"``
:type path: str
:param \*additional_paths: any additional path components in a list
:type \*additional_paths: list(str), optional
:return: the standardized path components in a list
:rtype: list(str)
"""
# if you change this, also update backend module pine.backend.data.service
if type(path) not in [list, tuple, set]:
path = [path]
if additional_paths:
path += additional_paths
# for every element in path, split by "/" into a list of paths, then remove empty values
# "/test" => ["test"], ["/test", "1"] => ["test", "1"], etc.
return [single_path for subpath in path for single_path in subpath.split("/") if single_path]
class BaseClient(object):
"""Base class for a client using a REST interface.
"""
__metaclass__ = abc.ABCMeta
def __init__(self, base_uri: str, name: str = None):
"""Constructor.
:param base_uri: the base URI for the server, e.g. ``"http://localhost:5000"``
:type base_uri: str
:param name: optional human-readable name for the server, defaults to None
:type name: str, optional
"""
self.base_uri: str = base_uri.strip("/")
"""The server's base URI.
:type: str
"""
self.session: requests.Session = None
"""The currently open session, or ``None``.
:type: requests.Session
"""
self.name: str = name
self.logger: logging.Logger = logging.getLogger(self.__class__.__name__)
@abc.abstractmethod
def is_valid(self) -> bool:
"""Returns whether this client and its connection(s) are valid.
:return: whether this client and its connection(s) are valid
:rtype: bool
"""
raise NotImplementedError()
def uri(self, path: str, *additional_paths: typing.List[str]) -> str:
r"""Makes a complete URI from the given path(s).
:param path: relative path, e.g. ``"users"``
:type path: str
:param \*additional_paths: any additional path components
:type \*additional_paths: list(str), optional
:return: the complete, standardized URI including the base URI, e.g. ``"http://localhost:5000/users"``
:rtype: str
"""
return "/".join([self.base_uri] + _standardize_path(path, *additional_paths))
def _req(self, method: str, path: str, *additional_paths: typing.List[str], **kwargs) -> requests.Response:
r"""Makes a :py:mod:`requests` call, checks for errors, and returns the response.
:param method: REST method (``"get"``, ``"post"``, etc.)
:type method: str
:param path: relative path, e.g. ``"users"``
:type path: str
:param \*additional_paths: any additional path components
:type \*additional_paths: list(str), optional
:param \**kwargs: any additional kwargs to send to :py:mod:`requests`
:type \**kwargs: dict
:raises exceptions.PineClientHttpException: if the HTTP request returns an error
:return: the :py:mod:`requests` :py:class:`requests.Response` object
:rtype: requests.Response
"""
uri = self.uri(path, *additional_paths)
self.logger.debug("{} {}".format(method.upper(), uri))
if self.session:
resp = self.session.request(method, uri, **kwargs)
else:
resp = requests.request(method, uri, **kwargs)
if not resp.ok:
uri = "\"/" + "/".join(_standardize_path(path, *additional_paths)) + "\""
raise exceptions.PineClientHttpException("{}".format(method.upper()),
"{} {}".format(self.name, uri) if self.name else uri,
resp)
return resp
def get(self, path: str, *additional_paths: typing.List[str], **kwargs) -> requests.Response:
r"""Makes a :py:mod:`requests` ``GET`` call, checks for errors, and returns the response.
:param path: relative path, e.g. ``"users"``
:type path: str
:param \*additional_paths: any additional path components
:type \*additional_paths: list(str), optional
:param \**kwargs: any additional kwargs to send to :py:mod:`requests`
:type \**kwargs: dict
:raises exceptions.PineClientHttpException: if the HTTP request returns an error
:return: the :py:mod:`requests` :py:class:`Response <requests.Response>` object
:rtype: requests.Response
"""
return self._req("GET", path, *additional_paths, **kwargs)
def put(self, path: str, *additional_paths: typing.List[str], **kwargs) -> requests.Response:
r"""Makes a :py:mod:`requests` ``PUT`` call, checks for errors, and returns the response.
:param path: relative path, e.g. ``"users"``
:type path: str
:param \*additional_paths: any additional path components
:type \*additional_paths: list(str), optional
:param \**kwargs: any additional kwargs to send to :py:mod:`requests`
:type \**kwargs: dict
:raises exceptions.PineClientHttpException: if the HTTP request returns an error
:return: the :py:mod:`requests` :py:class:`Response <requests.Response>` object
:rtype: requests.Response
"""
return self._req("PUT", path, *additional_paths, **kwargs)
def patch(self, path: str, *additional_paths: typing.List[str], **kwargs) -> requests.Response:
r"""Makes a :py:mod:`requests` ``PATCH`` call, checks for errors, and returns the response.
:param path: relative path, e.g. ``"users"``
:type path: str
:param \*additional_paths: any additional path components
:type \*additional_paths: list(str), optional
:param \**kwargs: any additional kwargs to send to :py:mod:`requests`
:type \**kwargs: dict
:raises exceptions.PineClientHttpException: if the HTTP request returns an error
:return: the :py:mod:`requests` :py:class:`Response <requests.Response>` object
:rtype: requests.Response
"""
return self._req("PATCH", path, *additional_paths, **kwargs)
def post(self, path: str, *additional_paths: typing.List[str], **kwargs) -> requests.Response:
r"""Makes a :py:mod:`requests` ``POST`` call, checks for errors, and returns the response.
:param path: relative path, e.g. ``"users"``
:type path: str
:param \*additional_paths: any additional path components
:type \*additional_paths: list(str), optional
:param \**kwargs: any additional kwargs to send to :py:mod:`requests`
:type \**kwargs: dict
:raises exceptions.PineClientHttpException: if the HTTP request returns an error
:return: the :py:mod:`requests` :py:class:`Response <requests.Response>` object
:rtype: requests.Response
"""
return self._req("POST", path, *additional_paths, **kwargs)
class EveClient(BaseClient):
"""A client to access Eve and, optionally, its underlying MongoDB instance.
"""
DEFAULT_DBNAME: str = "pmap_nlp"
"""The default DB name used by PINE.
:type: str
"""
def __init__(self, eve_base_uri: str, mongo_base_uri: str = None, mongo_dbname: str = DEFAULT_DBNAME):
"""Constructor.
:param eve_base_uri: the base URI for the eve server, e.g. ``"http://localhost:5001"``
:type eve_base_uri: str
:param mongo_base_uri: the base URI for the mongodb server, e.g. ``"mongodb://localhost:27018"``, defaults to ``None``
:type mongo_base_uri: str, optional
:param mongo_dbname: the DB name that PINE uses, defaults to ``"pmap_nlp"``
:type mongo_dbname: str, optional
"""
super().__init__(eve_base_uri, name="eve")
self.mongo_base_uri: str = mongo_base_uri
"""The base URI for the MongoDB server.
:type: str
"""
self.mongo: pymongo.MongoClient = pymongo.MongoClient(mongo_base_uri) if mongo_base_uri else None
"""The :py:class:`pymongo.mongo_client.MongoClient` instance.
:type: pymongo.mongo_client.MongoClient
"""
self.mongo_db: pymongo.database.Database = self.mongo[mongo_dbname] if self.mongo and mongo_dbname else None
"""The :py:class:`pymongo.database.Database` instance.
:type: pymongo.database.Database
"""
@overrides
def is_valid(self) -> bool:
if self.mongo_base_uri:
try:
if not pymongo.MongoClient(self.mongo_base_uri, serverSelectionTimeoutMS=1).server_info():
self.logger.error("Unable to connect to MongoDB")
return False
except:
self.logger.error("Unable to connect to MongoDB", exc_info=True)
return False
try:
self.ping()
except:
self.logger.error("Unable to ping eve", exc_info=True)
return False
return True
def ping(self) -> typing.Any:
"""Pings the eve server and returns the result.
:raises exceptions.PineClientHttpException: if the HTTP request returns an error
:returns: the JSON response from the server (probably ``"pong"``)
"""
return self.get("system/ping").json()
def about(self) -> dict:
"""Returns the 'about' dict from the server.
:raises exceptions.PineClientHttpException: if the HTTP request returns an error
:returns: the JSON response from the server
:rtype: dict
"""
return self.get("about").json()
def get_resource(self, resource: str, resource_id: str) -> dict:
"""Gets a resource from eve by its ID.
:raises exceptions.PineClientHttpException: if the HTTP request returns an error
:returns: the JSON object response from the server
:rtype: dict
"""
return self.get(resource, resource_id).json()
def _add_or_replace_resource(self, resource: str, obj: dict, valid_fn: typing.Callable[[dict, typing.Callable[[str], None]], bool] = None) -> str:
"""Adds or replaces the given resource.
:param resource: the resource type, e.g. ``"users"``
:type resource: str
:param obj: the resource object
:type obj: dict
:param valid_fn: a function to validate the resource object, defaults to ``None``
:type valid_fn: function, optional
:raises exceptions.PineClientValueException: if a valid_fn is passed in and the object fails
:raises exceptions.PineClientHttpException: if the HTTP request returns an error
:returns: the ID of the added/replaced resource
:rtype: str
"""
if valid_fn and not valid_fn(obj):
raise exceptions.PineClientValueException(obj, resource)
if models.ID_FIELD in obj:
try:
res = self.get_resource(resource, obj[models.ID_FIELD])
except exceptions.PineClientHttpException as e:
if e.resp.status_code == 404:
return self.post(resource, obj).json()[models.ID_FIELD]
else:
raise e
return self.put(resource, obj[models.ID_FIELD], json=obj, headers={"If-Match": res["_etag"]}).json()[models.ID_FIELD]
else:
return self.post(resource, obj).json()[models.ID_FIELD]
def _add_resources(self, resource: str, objs: typing.List[dict], valid_fn: typing.Callable[[dict, typing.Callable[[str], None]], bool] = None, replace_if_exists: bool = False):
"""Tries to add all the resource objects at once, optionally falling back to individual replacement if that fails.
:param resource: the resource type, e.g. ``"users"``
:type resource: str
:param objs: the resource objects
:type objs: list(dict)
:param valid_fn: a function to validate the resource object, defaults to ``None``
:type valid_fn: function, optional
:param replace_if_exists: whether to replace the resource with the given value if it already exists on the server, defaults to ``False``
:type replace_if_exists: bool, optional
:raises exceptions.PineClientValueException: if a valid_fn is passed in and any of the objects fails
:raises exceptions.PineClientHttpException: if the HTTP request returns an error
:returns: the IDs of the added resources
:rtype: list(str)
"""
if objs == None:
return []
if valid_fn:
for obj in objs:
if not valid_fn(obj, self.logger.warn):
raise exceptions.PineClientValueException(obj, resource)
try:
resp = self.post(resource, json=objs)
return [item[models.ID_FIELD] for item in resp.json()[models.ITEMS_FIELD]]
except exceptions.PineClientHttpException as e:
if e.resp.status_code == 409 and replace_if_exists:
return [self._add_or_replace_resource(resource, obj, valid_fn) for obj in objs]
else:
raise e
def add_users(self, users: typing.List[dict], replace_if_exists=False) -> typing.List[str]:
"""Adds the given users.
:param users: the user objects
:type users: list(dict)
:param replace_if_exists: whether to replace the resource with the given value if it already exists on the server, defaults to ``False``
:type replace_if_exists: bool, optional
:raises exceptions.PineClientValueException: if any of the user objects are not valid, see :py:func:`.models.is_valid_eve_user`
:raises exceptions.PineClientHttpException: if the HTTP request returns an error
:returns: the IDs of the added users
:rtype: list(str)
"""
for user in users:
if "password" in user:
user["passwdhash"] = password.hash_password(user["password"])
del user["password"]
return self._add_resources("users", users, valid_fn=models.is_valid_eve_user, replace_if_exists=replace_if_exists)
def get_users(self):
"""Gets all users.
:raises exceptions.PineClientHttpException: if the HTTP request returns an error
:returns: all the users
:rtype: list(dict)
"""
return self.get("users").json()[models.ITEMS_FIELD]
def add_pipelines(self, pipelines: typing.List[dict], replace_if_exists=False) -> typing.List[str]:
"""Adds the given pipelines.
:param pipelines: the pipeline objects
:type pipelines: list(dict)
:param replace_if_exists: whether to replace the resource with the given value if it already exists on the server, defaults to ``False``
:type replace_if_exists: bool, optional
:raises exceptions.PineClientValueException: if any of the pipeline objects are not valid, see :py:func:`.models.is_valid_eve_pipeline`
:raises exceptions.PineClientHttpException: if the HTTP request returns an error
:returns: the IDs of the added pipelines
:rtype: list(str)
"""
return self._add_resources("pipelines", pipelines, valid_fn=models.is_valid_eve_pipeline, replace_if_exists=replace_if_exists)
class PineClient(BaseClient):
"""A client to access PINE (more specifically: the backend).
"""
def __init__(self, backend_base_uri: str):
"""Constructor.
:param backend_base_uri: the base URI for the backend server, e.g. ``"http://localhost:5000"``
:type backend_base_uri: str
"""
super().__init__(backend_base_uri)
@overrides
def is_valid(self) -> bool:
try:
self.ping()
return True
except:
self.logger.error("Unable to ping PINE backend", exc_info=True)
return False
def ping(self) -> typing.Any:
"""Pings the backend server and returns the result.
:raises exceptions.PineClientHttpException: if the HTTP request returns an error
:returns: the JSON response from the server (probably ``"pong"``)
"""
return self.get("ping").json()
def about(self) -> dict:
"""Returns the 'about' dict from the server.
:raises exceptions.PineClientHttpException: if the HTTP request returns an error
:returns: the JSON response from the server
:rtype: dict
"""
return self.get("about").json()
def get_logged_in_user(self) -> dict:
"""Returns the currently logged in user, or None if not logged in.
:raises exceptions.PineClientHttpException: if the HTTP request returns an error
:returns: currently logged in user, or None if not logged in
:rtype: dict
"""
return self.get(["auth", "logged_in_user"]).json()
def get_my_user_id(self) -> str:
"""Returns the ID of the logged in user, or None if not logged in.
:raises exceptions.PineClientHttpException: if the HTTP request returns an error
:returns: the ID of the logged in user, or None if not logged in
:rtype: str
"""
u = self.get_logged_in_user()
return u["id"] if u and "id" in u else None
def is_logged_in(self) -> bool:
"""Returns whether the user is currently logged in or not.
:raises exceptions.PineClientHttpException: if the HTTP request returns an error
:returns: whether the user is currently logged in or not
:rtype: bool
"""
return self.session is not None and self.get_logged_in_user() is not None
def _check_login(self):
"""Checks whether user is logged in and raises an :py:class:`.exceptions.PineClientAuthException` if not.
:raises exceptions.PineClientAuthException: if not logged in
:raises exceptions.PineClientHttpException: if the HTTP request returns an error
"""
if not self.is_logged_in():
raise exceptions.PineClientAuthException("User is not logged in.")
def get_auth_module(self) -> str:
"""Returns the PINE authentication module, e.g. ``"eve"``.
:raises exceptions.PineClientHttpException: if the HTTP request returns an error
:returns: the PINE authentication module, e.g. ``"eve"``
:rtype: str
"""
return self.get(["auth", "module"]).json()
def login_eve(self, username: str, password: str) -> bool:
"""Logs in using eve credentials, and returns whether it was successful.
:param username: username
:type username: str
:param password: password
:type password: str
:raises exceptions.PineClientAuthException: if auth module is not eve or login was not successful
:raises exceptions.PineClientHttpException: if the HTTP request returns an error
:returns: whether the login was successful
:rtype: bool
"""
if self.get_auth_module() != "eve":
raise exceptions.PineClientAuthException("Auth module is not eve.")
if self.session:
self.logout()
self.session = requests.Session()
try:
self.post(["auth", "login"], json={
"username": username,
"password": password
})
return True
except exceptions.PineClientHttpException as e:
self.session.close()
self.session = None
if e.resp.status_code == 401:
raise exceptions.PineClientAuthException("Login failed for {}".format(username), cause=e)
else:
raise e
def logout(self):
"""Logs out the current user.
:raises exceptions.PineClientHttpException: if the HTTP request returns an error
"""
if self.is_logged_in():
self.post(["auth", "logout"])
if self.session:
self.session.close()
self.session = None
def get_pipelines(self) -> typing.List[dict]:
"""Returns all pipelines accessible to logged in user.
:raises exceptions.PineClientAuthException: if not logged in
:raises exceptions.PineClientHttpException: if the HTTP request returns an error
:returns: all pipelines accessible to logged in user
:rtype: list(dict)
"""
self._check_login()
return self.get("pipelines").json()[models.ITEMS_FIELD]
def collection_builder(self, **kwargs: dict) -> models.CollectionBuilder:
r"""Makes and returns a new :py:class:`.models.CollectionBuilder` with the logged in user.
:param \**kwargs: any additional args to pass in to the constructor
:type \**kwargs: dict
:returns: a new :py:class:`.models.CollectionBuilder` with the logged in user
:rtype: models.CollectionBuilder
"""
return models.CollectionBuilder(creator_id=self.get_my_user_id(), **kwargs)
def create_collection(self, builder: models.CollectionBuilder) -> str:
"""Creates a collection using the current value of the given builder and returns its ID.
:param builder: collection builder
:type builder: models.CollectionBuilder
:raises exceptions.PineClientValueException: if the given collection is not valid, see :py:func:`.models.is_valid_collection`
:raises exceptions.PineClientAuthException: if not logged in
:raises exceptions.PineClientHttpException: if the HTTP request returns an error
:returns: the created collection's ID
:rtype: str
"""
self._check_login()
if builder == None or not isinstance(builder, models.CollectionBuilder):
raise exceptions.PineClientValueException(builder, "CollectionBuilder")
if not builder.is_valid(self.logger.warn):
raise exceptions.PineClientValueException(builder, "collection")
return self.post("collections", data=builder.form_json, files=builder.files).json()[models.ID_FIELD]
def get_collection_documents(self, collection_id: str, truncate: bool, truncate_length: int = 30) -> typing.List[dict]:
"""Returns the documents in the given collection.
:param collection_id: the ID of the collection
:type collection_id: str
:param truncate: whether to truncate the document text (a good idea unless you need it)
:type truncate: bool
:param truncate_length: how many characters of the text you want if truncated, defaults to ``30``
:type truncate_length: int, optional
:returns: the documents in the given collection
:rtype: list(dict)
"""
return self.get(["documents", "by_collection_id", collection_id], params={
"truncate": json.dumps(truncate),
"truncateLength": json.dumps(truncate_length)
}).json()["_items"]
def add_document(self, document: dict = None, creator_id: str = None, collection_id: str = None,
overlap: int = None, text: str = None, metadata: dict = None) -> str:
"""Adds a new document to a collection and returns its ID.
Will use the logged in user ID for the creator_id if none is given. Although all the
parameters are optional, you must provide values either in the document or through the
kwargs in order to make a valid document.
:param document: optional document dict, will be overridden with any kwargs, defaults to ``None``
:type document: dict, optional
:param creator_id: optional creator_id for the document, defaults to ``None`` (not set)
:type creator_id: str, optional
:param collection_id: optional collection_id for the document, defaults to ``None`` (not set)
:type collection_id: str, optional
:param overlap: optional overlap for the document, defaults to ``None`` (not set)
:type overlap: int, optional
:param text: optional text for the document, defaults to ``None`` (not set)
:type text: str, optional
:param metadata: optional metadata for the document, defaults to ``None`` (not set)
:type metadata: dict, optional
:raises exceptions.PineClientValueException: if the given document parameters are not valid, see :py:func:`.models.is_valid_eve_document`
:raises exceptions.PineClientAuthException: if not logged in
:raises exceptions.PineClientHttpException: if the HTTP request returns an error
:returns: the created document's ID
:rtype: str
"""
self._check_login()
user_id = self.get_my_user_id()
if document == None or not isinstance(document, dict):
document = {}
if creator_id:
document["creator_id"] = creator_id
elif not "creator_id" in document:
document["creator_id"] = user_id
if collection_id:
document["collection_id"] = collection_id
if overlap != None:
document["overlap"] = overlap
if text:
document["text"] = text
if metadata != None:
document["metadata"] = metadata
if not models.is_valid_eve_document(document, self.logger.warn):
raise exceptions.PineClientValueException(document, "documents")
return self.post("documents", json=document).json()[models.ID_FIELD]
def add_documents(self, documents: typing.List[dict], creator_id: str = None, collection_id: str = None) -> typing.List[str]:
"""Adds multiple documents at once and returns their IDs.
Will use the logged in user ID for the creator_id if none is given.
:param documents: the documents to add
:type documents: list(dict)
:param creator_id: optional creator_id to set in the documents, defaults to ``None`` (not set)
:type creator_id: str, optional
:param collection_id: optional collection_id to set in the documents, defaults to ``None`` (not set)
:type collection_id: str, optional
:raises exceptions.PineClientValueException: if any of the given documents are not valid, see :py:func:`.models.is_valid_eve_document`
:raises exceptions.PineClientAuthException: if not logged in
:raises exceptions.PineClientHttpException: if the HTTP request returns an error
:returns: the created documents' IDs
:rtype: list(str)
"""
self._check_login()
user_id = self.get_my_user_id()
if documents == None or (not isinstance(documents, list) and not isinstance(documents, tuple)):
raise exceptions.PineClientValueException(documents, "documents")
for document in documents:
if creator_id:
document["creator_id"] = creator_id
elif "creator_id" not in document or not document["creator_id"]:
document["creator_id"] = user_id
if collection_id:
document["collection_id"] = collection_id
if not models.is_valid_eve_document(document, self.logger.warn):
raise exceptions.PineClientValueException(document, "documents")
return [doc["_id"] for doc in self.post("documents", json=documents).json()[models.ITEMS_FIELD]]
def annotate_document(self, document_id: str, doc_annotations: typing.List[str], ner_annotations: typing.List[typing.Union[dict, list, tuple]]) -> str:
"""Annotates the given document with the given values.
:param document_id: the document ID to annotate
:type document_id: str
:param doc_annotations: document annotations/labels
:type doc_annotations: list(str)
:param ner_annotations: NER annotations, where each annotation is either a list or a dict
:type ner_annotations: list
:raises exceptions.PineClientValueException: if any of the given annotations are not valid, see :py:func:`.models.is_valid_annotation`
:raises exceptions.PineClientAuthException: if not logged in
:raises exceptions.PineClientHttpException: if the HTTP request returns an error
:returns: the annotation ID
:rtype: str
"""
self._check_login()
if not document_id or not isinstance(document_id, str):
raise exceptions.PineClientValueException(document_id, "str")
body = {
"doc": doc_annotations,
"ner": ner_annotations
}
if not models.is_valid_annotation(body, self.logger.warn):
raise exceptions.PineClientValueException(body, "annotation")
return self.post(["annotations", "mine", "by_document_id", document_id], json=body).json()
def annotate_collection_documents(self, collection_id: str, document_annotations: dict, skip_document_updates=False) -> typing.List[str]:
"""Annotates documents in a collection.
:param collection_id: the ID of the collection containing the documents to annotate
:type collection_id: str
:param document_annotations: a dict containing "ner" list and "doc" list
:type document_annotations: dict
:param skip_document_updates: whether to skip updating the document "has_annotated" map, defaults to ``False``.
This should only be ``True`` if you properly set the
"has_annotated" map when you created the document.
:type skip_document_updates: bool
:raises exceptions.PineClientValueException: if any of the given annotations are not valid, see :py:func:`.models.is_valid_doc_annotations`
:raises exceptions.PineClientAuthException: if not logged in
:raises exceptions.PineClientHttpException: if the HTTP request returns an error
:returns: the annotation IDs
:rtype: list(str)
"""
self._check_login()
if not models.is_valid_doc_annotations(document_annotations, self.logger.warn):
raise exceptions.PineClientValueException(document_annotations, "document_annotations")
return self.post(["annotations", "mine", "by_collection_id", collection_id],
json=document_annotations,
params={"skip_document_updates":json.dumps(skip_document_updates)}).json()
class LocalPineClient(PineClient):
"""A client for a local PINE instance, including an :py:class<.EveClient>.
"""
def __init__(self, backend_base_uri: str, eve_base_uri: str, mongo_base_uri: str = None, mongo_dbname: str = EveClient.DEFAULT_DBNAME):
"""Constructor.
:param backend_base_uri: the base URI for the backend server, e.g. ``"http://localhost:5000"``
:type backend_base_uri: str
:param eve_base_uri: the base URI for the eve server, e.g. ``"http://localhost:5001"``
:type eve_base_uri: str
:param mongo_base_uri: the base URI for the mongodb server, e.g. ``"mongodb://localhost:27018"``, defaults to ``None``
:type mongo_base_uri: str, optional
:param mongo_dbname: the DB name that PINE uses, defaults to ``"pmap_nlp"``
:type mongo_dbname: str, optional
"""
super().__init__(backend_base_uri)
self.eve: EveClient = EveClient(eve_base_uri, mongo_base_uri, mongo_dbname=mongo_dbname)
"""The local :py:class:`EveClient` instance.
:type: EveClient
"""
@overrides
def is_valid(self) -> bool:
return super().is_valid() and self.eve.is_valid()
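# Usage sketch (not part of the original module): a LocalPineClient session against a dev stack
# on assumed default ports, with placeholder credentials and a hypothetical collection ID.
#
# client = LocalPineClient("http://localhost:5000", "http://localhost:5001",
#                          "mongodb://localhost:27018")
# client.login_eve("someuser", "somepassword")
# doc_ids = client.add_documents(
#     [{"text": "First document."}, {"text": "Second document."}],
#     collection_id="5babb6ee4eb7dd2c39b90000")
# client.annotate_document(doc_ids[0], ["some_label"], [])  # document label only, no NER spans
# client.logout()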


@@ -0,0 +1,65 @@
# (C) 2019 The Johns Hopkins University Applied Physics Laboratory LLC.
"""PINE client exceptions module.
"""
import requests
class PineClientException(Exception):
"""Base class for PINE client exceptions.
"""
def __init__(self, message: str, cause: Exception = None):
"""Constructor.
:param message: the message
:type message: str
:param cause: optional cause, defaults to ``None``
:type cause: Exception, optional
"""
super().__init__(message)
if cause:
self.__cause__ = cause
self.message = message
"""The message.
:type: str
"""
class PineClientHttpException(PineClientException):
"""A PINE client exception caused by an underlying HTTP exception.
"""
def __init__(self, method: str, path: str, resp: requests.Response):
"""Constructor.
:param method: the REST method (``"get"``, ``"post"``, etc.)
:type method: str
:param path: the human-readable path that caused the exception
:type path: str
:param resp: the :py:class:`Response <requests.Response>` with the error info
:type resp: requests.Response
"""
super().__init__("HTTP error with {} to {}: {} ({})".format(method, path, resp.status_code, resp.reason))
self.resp = resp
"""The :py:class:`Response <requests.Response>` with the error info
:type: requests.Response
"""
class PineClientValueException(PineClientException):
"""A PINE client exception caused by passing invalid data.
"""
def __init__(self, obj: dict, obj_type: str):
"""Constructor.
:param obj: the error data
:type obj: dict
:param obj_type: human-readable type of object
:type obj_type: str
"""
super().__init__("Object is not a valid type of {}".format(obj_type))
class PineClientAuthException(PineClientException):
pass
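# Usage sketch (illustrative): any failing HTTP call surfaces as a
# PineClientHttpException carrying the underlying requests.Response.
#
# try:
#     client.get("about")
# except PineClientHttpException as e:
#     print(e.message, e.resp.status_code, e.resp.reason)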

client/pine/client/log.py Normal file

@@ -0,0 +1,28 @@
# (C) 2019 The Johns Hopkins University Applied Physics Laboratory LLC.
import json
import logging.config
import os
# make sure this package has been installed
import pythonjsonlogger
CONFIG_FILE_ENV: str = "PINE_LOGGING_CONFIG_FILE"
"""The environment variable that optionally contains the file to use for logging configuration.
:type: str
"""
def setup_logging():
"""Sets up logging, if configured to do so.
The environment variable named by :py:data:`CONFIG_FILE_ENV` is checked and, if present, is
passed to :py:func:`logging.config.dictConfig`.
"""
if CONFIG_FILE_ENV not in os.environ:
return
file = os.environ[CONFIG_FILE_ENV]
if os.path.isfile(file):
with open(file, "r") as f:
logging.config.dictConfig(json.load(f))
logging.getLogger(__name__).info("Set logging configuration from file {}".format(file))
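# Usage sketch: point PINE_LOGGING_CONFIG_FILE at a logging dictConfig JSON
# before calling setup_logging(); the path below is illustrative.
#
# import os
# os.environ[CONFIG_FILE_ENV] = "shared/logging.python.dev.json"
# setup_logging()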


@@ -0,0 +1,937 @@
# (C) 2019 The Johns Hopkins University Applied Physics Laboratory LLC.
import io
import json
import typing
ID_FIELD: str = "_id"
"""The field used to store database ID.
:type: str
"""
ITEMS_FIELD: str = "_items"
"""The field used to access the items in a multi-item database response.
:type: str
"""
def _check_field_required_bool(obj: dict, field: str) -> bool:
"""Checks that the given field is in the object and is a bool.
:param obj: the object to check
:type obj: dict
:param field: the field to check
:type field: str
:returns: whether the given field is in the object and is a bool
:rtype: bool
"""
return field in obj and isinstance(obj[field], bool)
def _check_field_int(obj: dict, field: str) -> bool:
"""Checks that if the given field is in the object, that it is an int.
:param obj: the object to check
:type obj: dict
:param field: the field to check
:type field: str
:returns: if the given field is in the object, that it is an int
:rtype: bool
"""
return field not in obj or (obj[field] != None and isinstance(obj[field], int))
def _check_field_required_int(obj: dict, field: str) -> bool:
"""Checks that the given field is in the object and is an int.
:param obj: the object to check
:type obj: dict
:param field: the field to check
:type field: str
:returns: whether the given field is in the object and is an int
:rtype: bool
"""
return field in obj and obj[field] != None and isinstance(obj[field], int)
def _check_field_float(obj: dict, field: str) -> bool:
"""Checks that if the given field is in the object, that it is a float.
:param obj: the object to check
:type obj: dict
:param field: the field to check
:type field: str
:returns: if the given field is in the object, that it is a float
:rtype: bool
"""
return field not in obj or (obj[field] != None and (isinstance(obj[field], float) or isinstance(obj[field], int)))
def _check_field_required_float(obj: dict, field: str) -> bool:
"""Checks that the given field is in the object and is a float.
:param obj: the object to check
:type obj: dict
:param field: the field to check
:type field: str
:returns: whether the given field is in the object and is a float
:rtype: bool
"""
return field in obj and obj[field] != None and (isinstance(obj[field], float) or isinstance(obj[field], int))
def _check_field_string(obj: dict, field: str) -> bool:
"""Checks that if the given field is in the object, that it is a string.
:param obj: the object to check
:type obj: dict
:param field: the field to check
:type field: str
:returns: if the given field is in the object, that it is a string
:rtype: bool
"""
return field not in obj or isinstance(obj[field], str)
def _check_field_required_string(obj: dict, field: str) -> bool:
"""Checks that the given field is in the object and is a string.
:param obj: the object to check
:type obj: dict
:param field: the field to check
:type field: str
:returns: whether the given field is in the object and is a string
:rtype: bool
"""
return field in obj and obj[field] != None and isinstance(obj[field], str) and len(obj[field].strip()) != 0
def _check_field_string_list(obj: dict, field: str, min_length: int = 0) -> bool:
"""Checks that if the given field is in the object, that it is a string list.
:param obj: the object to check
:type obj: dict
:param field: the field to check
:type field: str
:param min_length: the minimum length of the list (if > 0), defaults to 0
:type min_length: int, optional
:returns: if the given field is in the object, that it is a string list
:rtype: bool
"""
if field not in obj:
return True
if not isinstance(obj[field], list) and not isinstance(obj[field], tuple):
return False
if min_length > 0 and len(obj[field]) < min_length:
return False
for elem in obj[field]:
if elem == None or not isinstance(elem, str) or len(elem.strip()) == 0:
return False
return True
def _check_field_required_string_list(obj: dict, field: str, min_length: int = 0) -> bool:
"""Checks that the given field is in the object and is a string list.
:param obj: the object to check
:type obj: dict
:param field: the field to check
:type field: str
:param min_length: the minimum length of the list (if > 0), defaults to 0
:type min_length: int, optional
:returns: whether the given field is in the object and is a string list
:rtype: bool
"""
if field not in obj or obj[field] == None or (not isinstance(obj[field], list) and not isinstance(obj[field], tuple)):
return False
if min_length > 0 and len(obj[field]) < min_length:
return False
for elem in obj[field]:
if elem == None or not isinstance(elem, str) or len(elem.strip()) == 0:
return False
return True
def _check_field_dict(obj: dict, field: str) -> bool:
"""Checks that if the given field is in the object, that it is a dict.
:param obj: the object to check
:type obj: dict
:param field: the field to check
:type field: str
:returns: if the given field is in the object, that it is a dict
:rtype: bool
"""
return field not in obj or isinstance(obj[field], dict)
def _check_field_required_dict(obj: dict, field: str) -> bool:
"""Checks that the given field is in the object and is a dict.
:param obj: the object to check
:type obj: dict
:param field: the field to check
:type field: str
:returns: whether the given field is in the object and is a dict
:rtype: bool
"""
return field in obj and obj[field] != None and isinstance(obj[field], dict)
def _check_field_bool(obj: dict, field: str) -> bool:
"""Checks that if the given field is in the object, that it is a bool.
:param obj: the object to check
:type obj: dict
:param field: the field to check
:type field: str
:returns: if the given field is in the object, that it is a bool
:rtype: bool
"""
return field not in obj or isinstance(obj[field], bool)
####################################################################################################
def is_valid_eve_user(user: dict, error_callback: typing.Callable[[str], None] = None) -> bool:
"""Checks whether the given user object is valid.
A valid user object has an ``_id``, ``firstname``, and ``lastname`` that are non-empty string
fields. If ``email``, ``description``, or ``passwdhash`` are present, they are string fields.
If ``role`` is present, it is a list of strings that are either ``administrator`` or ``user``.
:param user: user object
:type user: dict
:param error_callback: optional callback that is called with any error messages, defaults to ``None``
:type error_callback: function, optional
:returns: whether the given user object is valid
:rtype: bool
"""
if user == None or not isinstance(user, dict):
if error_callback:
error_callback("Given object is not a dict.")
return False
if not _check_field_required_string(user, ID_FIELD) or \
not _check_field_required_string(user, "firstname") or \
not _check_field_required_string(user, "lastname"):
if error_callback:
error_callback("Given object is missing {}, firstname, or lastname fields.".format(ID_FIELD))
return False
if not _check_field_string(user, "email") or \
not _check_field_string(user, "description") or \
not _check_field_string(user, "passwdhash"):
if error_callback:
error_callback("Fields email, description, or passwd hash are not valid.")
return False
if "role" in user:
if user["role"] == None or (not isinstance(user["role"], list) and not isinstance(user["role"], tuple)):
if error_callback:
error_callback("Field role is not a list.")
return False
for role in user["role"]:
if role == None or not isinstance(role, str) or role not in ["administrator", "user"]:
error_callback("One or mole roles is not valid.")
return False
return True
def is_valid_eve_pipeline(pipeline: dict, error_callback: typing.Callable[[str], None] = None) -> bool:
"""Checks whether the given pipeline object is valid.
A valid pipeline has an ``_id``, ``title``, and ``name`` that are non-empty string fields. If
``description`` is provided, it is a string field. If ``parameters`` are provided, it is a
dict field.
:param pipeline: pipeline object
:type pipeline: dict
:param error_callback: optional callback that is called with any error messages, defaults to ``None``
:type error_callback: function, optional
:returns: whether the given pipeline object is valid
:rtype: bool
"""
if pipeline == None or not isinstance(pipeline, dict):
if error_callback:
error_callback("Given object is not a dict.")
return False
if not _check_field_required_string(pipeline, ID_FIELD) or \
not _check_field_required_string(pipeline, "title") or \
not _check_field_required_string(pipeline, "name"):
if error_callback:
error_callback("Given object is missing {}, title, or name fields.".format(ID_FIELD))
return False
if not _check_field_string(pipeline, "description"):
if error_callback:
error_callback("Field description is not valid.")
return False
if "parameters" in pipeline and (pipeline["parameters"] == None or not isinstance(pipeline["parameters"], dict)):
if error_callback:
error_callback("Field parameters is not a dict.")
return False
return True
def is_valid_eve_collection(collection: dict, error_callback: typing.Callable[[str], None] = None) -> bool:
"""Checks whether the given collection object is valid.
A valid collection has a ``creator_id`` that is a non-empty string field. It has a ``labels``
that is a non-empty list of strings. If ``annotators`` or ``viewers`` are provided, they are
lists of strings. If ``metadata`` or ``configuration`` are provided, they are dicts. If
``archived`` is provided, it is a bool.
:param collection: collection object
:type collection: dict
:param error_callback: optional callback that is called with any error messages, defaults to ``None``
:type error_callback: function, optional
:returns: whether the given collection object is valid
:rtype: bool
"""
if collection == None or not isinstance(collection, dict):
if error_callback:
error_callback("Given object is not a dict.")
return False
if not _check_field_required_string(collection, "creator_id") or \
not _check_field_required_string_list(collection, "labels"):
if error_callback:
error_callback("Given object is missing creator_id or labels fields.")
return False
if not _check_field_string_list(collection, "annotators") or \
not _check_field_string_list(collection, "viewers") or \
not _check_field_dict(collection, "metadata") or \
not _check_field_dict(collection, "configuration") or \
not _check_field_bool(collection, "archived"):
if error_callback:
error_callback("Field annotators, viewers, metadata, configuration, or archived is not valid.")
return False
return True
def is_valid_collection(form: dict, files: dict, error_callback: typing.Callable[[str], None] = None) -> bool:
"""Checks whether the given form and files parameters are valid for creating a collection.
A valid form has a ``collection`` that is a dict field and is valid via
:py:func:`.is_valid_eve_collection`. Additionally, the collection has string ``title`` and
``description`` fields in its ``metadata``. It also has at least one element for ``labels``,
``viewers``, and ``annotators``, and the ``creator_id`` must be in both ``viewers`` and
``annotators``.
The form also has ``overlap`` as a float field between 0 and 1 (inclusive), ``train_every`` as
an int field that is at least 5, and ``pipelineId`` as a string field.
If files are provided, file ``file`` and any files starting with ``imageFile`` are checked.
If a file ``file`` is provided, the form must also have a boolean ``csvHasHeader`` field and an
int ``csvTextCol`` field.
:param form: form data to send to backend
:type form: dict
:param files: file data to send to backend
:type files: dict
:param error_callback: optional callback that is called with any error messages, defaults to ``None``
:type error_callback: function, optional
:returns: whether the given form and files parameters are valid for creating a collection
:rtype: bool
"""
if not form or not isinstance(form, dict) or not _check_field_required_dict(form, "collection"):
if error_callback:
error_callback("Missing or invalid collection.")
return False
collection = form["collection"]
if not is_valid_eve_collection(collection, error_callback=error_callback):
return False
if not _check_field_required_dict(collection, "metadata"):
if error_callback:
error_callback("Missing or invalid metadata.")
return False
md = collection["metadata"]
if not _check_field_required_string(md, "title") or \
not _check_field_required_string(md, "description"):
if error_callback:
error_callback("Missing metadata title or description.")
return False
if not _check_field_required_string_list(collection, "labels", 1) or \
not _check_field_required_string_list(collection, "viewers", 1) or \
not _check_field_required_string_list(collection, "annotators", 1):
if error_callback:
error_callback("Need at least one label, viewer, and annotator.")
return False
if collection["creator_id"] not in collection["viewers"] or collection["creator_id"] not in collection["annotators"]:
if error_callback:
error_callback("Creator ID should be in viewers and annotators.")
return False
if not _check_field_required_float(form, "overlap") or \
not _check_field_required_int(form, "train_every") or \
not _check_field_required_string(form, "pipelineId"):
if error_callback:
error_callback("Missing fields overlap, train_every, or pipelineId.")
return False
if form["overlap"] < 0 or form["overlap"] > 1:
if error_callback:
error_callback("Field overlap must be between 0 and 1.")
return False
if form["train_every"] < 5:
if error_callback:
error_callback("Field train_every must be >= 5.")
return False
if files:
for key in files:
if key == "file":
if not _check_field_required_bool(form, "csvHasHeader") or \
not _check_field_required_int(form, "csvTextCol"):
if error_callback:
error_callback("Missing fields csvHasHeader or csvTextCol.")
return False
if not isinstance(files[key], io.IOBase):
if error_callback:
error_callback("File {} is not an open file.".format(key))
return False
elif key.startswith("imageFile"):
if not isinstance(files[key], io.IOBase):
if error_callback:
error_callback("File {} is not an open file.".format(key))
return False
return True
def is_valid_eve_document(document: dict, error_callback: typing.Callable[[str], None] = None) -> bool:
"""Checks whether the given document object is valid.
A valid document has a ``creator_id`` and ``collection_id`` that are non-empty string fields.
    Optionally, it may have an int ``overlap`` field, a string ``text`` field, and dict
    ``metadata`` and ``has_annotated`` fields.
:param document: document object
:type document: dict
:param error_callback: optional callback that is called with any error messages, defaults to ``None``
:type error_callback: function, optional
:returns: whether the given document object is valid
:rtype: bool
"""
    if document is None or not isinstance(document, dict):
if error_callback:
error_callback("Given object is not a dict.")
return False
if not _check_field_required_string(document, "creator_id") or \
not _check_field_required_string(document, "collection_id"):
if error_callback:
error_callback("Missing required string fields creator_id and collection_id.")
return False
if not _check_field_int(document, "overlap") or \
not _check_field_string(document, "text") or \
not _check_field_dict(document, "metadata") or \
not _check_field_dict(document, "has_annotated"):
if error_callback:
error_callback("Invalid fields overlap, text, metadata, or has_annotated.")
return False
return True
def is_valid_doc_annotation(ann: typing.Any, error_callback: typing.Callable[[str], None] = None) -> bool:
"""Checks whether the given annotation is a valid document label/annotation.
This means that it is a non-empty string.
:param ann: annotation
:param error_callback: optional callback that is called with any error messages, defaults to ``None``
:type error_callback: function, optional
:returns: whether the given annotation is a valid document label/annotation
:rtype: bool
"""
if not ann or not isinstance(ann, str):
if error_callback:
error_callback("Doc annotation is not a string.")
return False
if len(ann.strip()) == 0:
if error_callback:
error_callback("Doc annotation is empty.")
return False
return True
def is_valid_ner_annotation(ann: typing.Any, error_callback: typing.Callable[[str], None] = None) -> bool:
"""Checks whether the given annotation is a valid document NER annotation.
Valid NER annotations take one of two forms: a :py:class:`dict` or a
:py:class:`list`/:py:class:`tuple` of size 3.
A valid NER :py:class:`dict` has the following fields:
* ``start``: an :py:class:`int` that is >= 0
* ``end``: an :py:class:`int` that is >= 0
* ``label``: a non-empty :py:class:`str`
A valid NER :py:class:`list`/:py:class:`tuple` has the following elements:
* element ``0``: an :py:class:`int` that is >= 0
* element ``1``: an :py:class:`int` that is >= 0
* element ``2``: a non-empty :py:class:`str`
:param ann: annotation
:param error_callback: optional callback that is called with any error messages, defaults to ``None``
:type error_callback: function, optional
:returns: whether the given annotation is a valid document label/annotation
:rtype: bool
"""
    if isinstance(ann, dict):
        if "start" not in ann or not isinstance(ann["start"], int) or ann["start"] < 0:
            if error_callback:
                error_callback("Field start is not valid ({}).".format(ann.get("start")))
            return False
        if "end" not in ann or not isinstance(ann["end"], int) or ann["end"] < 0:
            if error_callback:
                error_callback("Field end is not valid ({}).".format(ann.get("end")))
            return False
        if "label" not in ann or not isinstance(ann["label"], str) or len(ann["label"].strip()) == 0:
            if error_callback:
                error_callback("Field label is not valid ({}).".format(ann.get("label")))
            return False
    elif isinstance(ann, (list, tuple)):
        if len(ann) != 3:
            if error_callback:
                error_callback("Annotation length is not 3.")
            return False
        if not isinstance(ann[0], int) or ann[0] < 0 or not isinstance(ann[1], int) or ann[1] < 0 or \
                not isinstance(ann[2], str) or len(ann[2].strip()) == 0:
            if error_callback:
                error_callback("Annotation's first element must be int, second element must be int, third element must be string.")
            return False
    else:
        if error_callback:
            error_callback("NER annotation is not a dict or list/tuple ({}).".format(type(ann)))
        return False
return True
def is_valid_annotation(body: dict, error_callback: typing.Callable[[str], None] = None) -> bool:
"""Checks whether the given body is valid to create an annotation.
A valid body is a :py:class:`dict` with two fields:
* ``doc``: a list of valid doc annotations (see :py:func:`.is_valid_doc_annotation`)
* ``ner``: a list of valid NER annotations (see :py:func:`.is_valid_ner_annotation`)
:param body: annotation body
:type body: dict
:param error_callback: optional callback that is called with any error messages, defaults to ``None``
:type error_callback: function, optional
:returns: whether the given body is valid to create an annotation
:rtype: bool
"""
if not body or not isinstance(body, dict):
if error_callback:
error_callback("Body is not a dict ({}).".format(type(body)))
return False
if not _check_field_required_string_list(body, "doc"):
if error_callback:
error_callback("Missing string list field doc.")
return False
for ann in body["doc"]:
if not is_valid_doc_annotation(ann, error_callback=error_callback):
return False
if "ner" not in body or (not isinstance(body["ner"], list) and not isinstance(body["ner"], tuple)):
if error_callback:
error_callback("Invalid NER annotation field ner.")
return False
for ann in body["ner"]:
if not is_valid_ner_annotation(ann, error_callback=error_callback):
return False
return True
def is_valid_doc_annotations(doc_annotations: dict, error_callback: typing.Callable[[str], None] = None) -> bool:
"""Checks whether the given document annotations are valid.
A valid document annotations object is a :py:class:`dict`, where the keys are :py:class:`str`
document IDs, and the values are valid annotation bodies (see :py:func:`.is_valid_annotation`).
:param doc_annotations: document annotations
    :type doc_annotations: dict
:param error_callback: optional callback that is called with any error messages, defaults to ``None``
:type error_callback: function, optional
    :returns: whether the given document annotations are valid
:rtype: bool
"""
if not doc_annotations or not isinstance(doc_annotations, dict):
if error_callback:
error_callback("Annotations is not a dict ({}).".format(type(doc_annotations)))
return False
for doc_id in doc_annotations:
if not doc_id or not isinstance(doc_id, str) or len(doc_id.strip()) == 0:
if error_callback:
error_callback("Document ID is not valid ({}).".format(doc_id))
return False
annotations = doc_annotations[doc_id]
if not is_valid_annotation(annotations, error_callback=error_callback):
return False
return True
####################################################################################################
class CollectionBuilder(object):
"""A class that can build the form and files fields that are necessary to create a collection.
"""
def __init__(self,
collection: dict = None,
creator_id: str = None,
viewers: typing.List[str] = None,
annotators: typing.List[str] = None,
labels: typing.List[str] = None,
title: str = None,
description: str = None,
allow_overlapping_ner_annotations: bool = None,
pipelineId: str = None,
overlap: float = None,
train_every: int = None,
classifierParameters: dict = None,
document_csv_file: str = None,
document_csv_file_has_header: bool = None,
document_csv_file_text_column: int = None,
image_files: typing.List[str] = None):
"""Constructor.
:param collection: starting parameters for the collection, defaults to ``None`` (not set)
:type collection: dict, optional
:param creator_id: user ID for the creator, see :py:meth:`.creator_id`, defaults to ``None`` (not set)
:type creator_id: str, optional
:param viewers: viewer IDs for the collection, see :py:meth:`.viewer`, defaults to ``None`` (not set)
:type viewers: list(str), optional
:param annotators: annotator IDs for the collection, see :py:meth:`.annotator`, defaults to ``None`` (not set)
:type annotators: list(str), optional
:param labels: labels for the collection, see :py:meth:`.label`, defaults to ``None`` (not set)
:type labels: list(str), optional
:param title: metadata title, see :py:meth:`.title`, defaults to ``None`` (not set)
:type title: str, optional
:param description: metadata description, see :py:meth:`.description`, defaults to ``None`` (not set)
:type description: str, optional
:param allow_overlapping_ner_annotations: optional configuration for allowing overlapping NER
annotations, see :py:meth:`.allow_overlapping_ner_annotations`,
defaults to ``None`` (not set)
:type allow_overlapping_ner_annotations: bool
:param pipelineId: the ID of the pipeline from which to create the classifier,
see :py:meth:`.classifier`, defaults to ``None`` (not set)
:type pipelineId: str, optional
:param overlap: the classifier overlap, see :py:meth:`.classifier`, defaults to ``None`` (not set)
:type overlap: float, optional
:param train_every: train the model after this many documents are annotated,
see :py:meth:`.classifier`, defaults to ``None`` (not set)
:type train_every: int, optional
:param classifierParameters: any parameters to pass to the classifier,
see :py:meth:`.classifier`, defaults to ``None`` (not set)
:type classifierParameters: dict, optional
:param document_csv_file: the filename of the local document CSV file,
            see :py:meth:`.document_csv_file`, defaults to ``None`` (not set)
:type document_csv_file: str, optional
:param document_csv_file_has_header: whether the document CSV file has a header,
            see :py:meth:`.document_csv_file`, defaults to ``None`` (not set)
:type document_csv_file_has_header: bool, optional
:param document_csv_file_text_column: if the document CSV file has headers, the document text
can be found in this column index (the others are used for
            document metadata), see :py:meth:`.document_csv_file`,
defaults to ``None`` (not set)
:type document_csv_file_text_column: int, optional
:param image_files: any image files to add to the collection, see :py:meth:`.image_file`,
defaults to ``None`` (not set)
:type image_files: list(str)
"""
self.form = {
"collection": {
"creator_id": None,
"metadata": {
"title": None,
"description": None
},
"configuration": {
},
"labels": None,
"viewers": None,
"annotators": None
}
}
"""The form data.
:type: dict
"""
if collection:
self.form["collection"].update(collection)
self.files = {}
"""The files data.
:type: dict
"""
self._image_file_counter = 0
if creator_id:
self.creator_id(creator_id)
if viewers:
for viewer in viewers: self.viewer(viewer)
if annotators:
            for annotator in annotators: self.annotator(annotator)
if labels:
for label in labels: self.label(label)
if title:
self.title(title)
if description:
self.description(description)
        if allow_overlapping_ner_annotations is not None:
self.allow_overlapping_ner_annotations(allow_overlapping_ner_annotations)
        if document_csv_file and document_csv_file_has_header is not None and document_csv_file_text_column is not None:
self.document_csv_file(document_csv_file, document_csv_file_has_header, document_csv_file_text_column)
if image_files:
for image_file in image_files: self.image_file(image_file)
if pipelineId:
kwargs = {}
            if overlap is not None: kwargs["overlap"] = overlap
            if train_every is not None: kwargs["train_every"] = train_every
            if classifierParameters is not None: kwargs["classifierParameters"] = classifierParameters
self.classifier(pipelineId, **kwargs)
@property
def collection(self) -> dict:
"""Returns the collection information from the form.
:returns: collection information from the form
:rtype: dict
"""
return self.form["collection"]
@property
def form_json(self) -> dict:
"""Returns the form where the values have been JSON-encoded.
:returns: the form where the values have been JSON-encoded
:rtype: dict
"""
return {key: json.dumps(value) for (key, value) in self.form.items()}
def creator_id(self, user_id: str) -> "CollectionBuilder":
"""Sets the creator_id to the given, and adds to viewers and annotators.
:param user_id: the user ID to use for the creator_id
:type user_id: str
:returns: self
:rtype: models.CollectionBuilder
"""
self.collection["creator_id"] = user_id
self.viewer(user_id)
self.annotator(user_id)
return self
def viewer(self, user_id: str) -> "CollectionBuilder":
"""Adds the given user to the list of viewers.
:param user_id: the user ID to add as a viewer
:type user_id: str
:returns: self
:rtype: models.CollectionBuilder
"""
if not self.collection["viewers"]:
self.collection["viewers"] = [user_id]
elif user_id not in self.collection["viewers"]:
self.collection["viewers"].append(user_id)
return self
def annotator(self, user_id: str) -> "CollectionBuilder":
"""Adds the given user to the list of annotators.
:param user_id: the user ID to add as an annotator
:type user_id: str
:returns: self
:rtype: models.CollectionBuilder
"""
if not self.collection["annotators"]:
self.collection["annotators"] = [user_id]
elif user_id not in self.collection["annotators"]:
self.collection["annotators"].append(user_id)
return self
def label(self, label: str) -> "CollectionBuilder":
"""Adds the given label to the collection.
:param label: label to add
:type label: str
:returns: self
:rtype: models.CollectionBuilder
"""
if not self.collection["labels"]:
self.collection["labels"] = [label]
elif label not in self.collection["labels"]:
self.collection["labels"].append(label)
return self
def metadata(self, key: str, value: typing.Any) -> "CollectionBuilder":
"""Adds the given metadata key/value to the collection.
:param key: metadata key
:type key: str
:param value: metadata value
:returns: self
:rtype: models.CollectionBuilder
"""
self.collection["metadata"][key] = value
return self
def title(self, title: str) -> "CollectionBuilder":
"""Sets the metadata title to the given.
:param title: collection title
:type title: str
:returns: self
:rtype: models.CollectionBuilder
"""
return self.metadata("title", title)
def description(self, description: str) -> "CollectionBuilder":
"""Sets the metadata description to the given.
:param description: collection description
:type description: str
:returns: self
:rtype: models.CollectionBuilder
"""
return self.metadata("description", description)
def configuration(self, key: str, value: typing.Any) -> "CollectionBuilder":
"""Adds the given configuration key/value to the collection.
:param key: configuration key
:type key: str
:param value: configuration value
:returns: self
:rtype: models.CollectionBuilder
"""
self.collection["configuration"][key] = value
return self
    def allow_overlapping_ner_annotations(self, allow_overlapping_ner_annotations: bool) -> "CollectionBuilder":
"""Sets the configuration value for allow_overlapping_ner_annotations to the given.
:param allow_overlapping_ner_annotations: whether to allow overlapping NER annotations
:type allow_overlapping_ner_annotations: bool
:returns: self
:rtype: models.CollectionBuilder
"""
return self.configuration("allow_overlapping_ner_annotations", allow_overlapping_ner_annotations)
    def classifier(self, pipelineId: str, overlap: float = 0.0, train_every: int = 100, classifierParameters: dict = None) -> "CollectionBuilder":
        """Sets classifier information for the created collection.
        :param pipelineId: the ID of the pipeline from which to create the classifier
        :type pipelineId: str
        :param overlap: the classifier overlap, defaults to ``0.0``
        :type overlap: float, optional
        :param train_every: train the model after this many documents are annotated, defaults to ``100``
        :type train_every: int, optional
        :param classifierParameters: any parameters to pass to the classifier, defaults to ``None`` (treated as ``{}``)
        :type classifierParameters: dict, optional
        :returns: self
        :rtype: models.CollectionBuilder
        """
        self.form["pipelineId"] = pipelineId
        self.form["overlap"] = overlap
        self.form["train_every"] = train_every
        self.form["classifierParameters"] = classifierParameters if classifierParameters is not None else {}
        return self
def document_csv_file(self, csv_filename: str, has_header: bool, text_column: int) -> "CollectionBuilder":
"""Sets the CSV file used to create documents to the given.
May raise an Exception if there is a problem opening the indicated file.
:param csv_filename: the filename of the local CSV file
:type csv_filename: str
:param has_header: whether the CSV file has a header
:type has_header: bool
:param text_column: if the CSV file has headers, the document text can be found in this column index
(the others are used for document metadata)
:type text_column: int
:returns: self
:rtype: models.CollectionBuilder
"""
if "file" in self.files and self.files["file"]:
self.files["file"].close()
self.files["file"] = open(csv_filename, "rb")
self.form["csvHasHeader"] = has_header
self.form["csvTextCol"] = text_column
return self
def image_file(self, image_filename: str) -> "CollectionBuilder":
"""Adds the given image file to the collection.
May raise an Exception if there is a problem opening the indicated file.
:param image_filename: the filename of the local image file
:type image_filename: str
:returns: self
:rtype: models.CollectionBuilder
"""
self.files["imageFile{}".format(self._image_file_counter)] = open(image_filename, "rb")
self._image_file_counter += 1
return self
    def is_valid(self, error_callback: typing.Callable[[str], None] = None) -> bool:
"""Checks whether the currently set values are valid or not.
See :py:func:`.is_valid_collection`.
:param error_callback: optional callback that is called with any error messages, defaults to ``None``
:type error_callback: function, optional
:returns: whether the currently set values are valid or not
:rtype: bool
"""
return is_valid_collection(self.form, self.files, error_callback=error_callback)
####################################################################################################
def remove_eve_fields(obj: dict, remove_timestamps: bool = True, remove_versions: bool = True):
"""Removes fields inserted by eve from the given object.
:param obj: the object
:type obj: dict
:param remove_timestamps: whether to remove the timestamp fields, defaults to ``True``
:type remove_timestamps: bool
:param remove_versions: whether to remove the version fields, defaults to ``True``
:type remove_versions: bool
"""
fields = ["_etag", "_links"]
if remove_timestamps: fields += ["_created", "_updated"]
if remove_versions: fields += ["_version", "_latest_version"]
for f in fields:
if f in obj:
del obj[f]
def remove_nonupdatable_fields(obj: dict):
"""Removes all non-updatable fields from the given object.
These fields would cause a ``PUT``/``PATCH`` to be rejected because they are not user-modifiable.
:param obj: the object
:type obj: dict
"""
remove_eve_fields(obj)
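Taken together, the validators and `CollectionBuilder` above are the client-side gate before a new collection is posted to the backend. A minimal usage sketch (hypothetical, not part of this commit; the `pine.client.models` import path is an assumption based on the new `client/` package):

```python
from pine.client import models  # assumed import path for the new client package

# Build the form/files pair for a new collection.
builder = models.CollectionBuilder(
    creator_id="user-1",                    # also added to viewers and annotators
    labels=["PERSON", "ORG"],
    title="Demo collection",
    description="A small demo collection",
    pipelineId="5babb6ee4eb7dd2c39b9671c",  # an existing pipeline ID
    overlap=0.1,
    train_every=10,
)

# Validate before sending; errors are reported through the callback.
errors = []
if builder.is_valid(error_callback=errors.append):
    payload = builder.form_json  # form values JSON-encoded for a multipart POST
else:
    print("Invalid collection:", errors)

# The standalone validators also work on their own; both NER annotation
# forms are accepted: a dict or a (start, end, label) triple.
assert models.is_valid_ner_annotation({"start": 0, "end": 6, "label": "PERSON"})
assert models.is_valid_ner_annotation((7, 10, "ORG"))
```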

View File

@@ -0,0 +1,33 @@
# (C) 2019 The Johns Hopkins University Applied Physics Laboratory LLC.
import base64
import bcrypt
import hashlib
def hash_password(password: str) -> str:
"""Hashes the given password for use in user object.
:param password: password
:type password: str
:returns: hashed password
:rtype: str
"""
sha256 = hashlib.sha256(password.encode()).digest().replace(b"\x00", b"")
hashed_password_bytes = bcrypt.hashpw(sha256, bcrypt.gensalt())
return base64.b64encode(hashed_password_bytes).decode()
def check_password(password: str, hashed_password: str) -> bool:
"""Checks the given password against the given hash.
:param password: password to check
:type password: str
:param hashed_password: hashed password to check against
:type hashed_password: str
:returns: whether the password matches the hash
:rtype: bool
"""
sha256 = hashlib.sha256(password.encode()).digest().replace(b"\x00", b"")
hashed_password_bytes = base64.b64decode(hashed_password.encode())
return bcrypt.checkpw(sha256, hashed_password_bytes)
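
The SHA-256 pre-hash keeps arbitrarily long passwords within bcrypt's input limit while still getting bcrypt's adaptive work factor. A quick round-trip sketch (hypothetical usage, not part of the commit):

```python
hashed = hash_password("correct horse battery staple")
assert check_password("correct horse battery staple", hashed)
assert not check_password("wrong password", hashed)
```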

View File

@@ -5,20 +5,22 @@ services:
   backend:
     environment:
       - AUTH_MODULE=${AUTH_MODULE}
       - VEGAS_CLIENT_SECRET
-      - EVE_SERVER=http://eve:7510
+      - EVE_SERVER=http://eve:${EVE_PORT}
       - REDIS_SERVER=redis
-      - PINE_LOGGING_CONFIG_FILE=/nlp-web-app/shared/logging.python.dev.json
+      - PINE_LOGGING_CONFIG_FILE=${PINE_LOGGING_CONFIG_FILE:-/nlp-web-app/shared/logging.python.dev.json}
   eve:
     build:
       args:
         - DB_DIR=/nlp-web-app/eve/db
+        - MONGO_PORT=${MONGO_PORT}
     volumes:
-      - eve_db:/nlp-web-app/eve/db
+      - ${EVE_DB_VOLUME}:/nlp-web-app/eve/db
     environment:
       - MONGO_URI=
-      - PINE_LOGGING_CONFIG_FILE=/nlp-web-app/shared/logging.python.dev.json
+      - PINE_LOGGING_CONFIG_FILE=${PINE_LOGGING_CONFIG_FILE:-/nlp-web-app/shared/logging.python.dev.json}
   frontend_annotation:
     build:

16
docker-compose.test.yml Normal file
View File

@@ -0,0 +1,16 @@
# (C) 2019 The Johns Hopkins University Applied Physics Laboratory LLC.

version: "3"

services:
  backend:
    ports:
      - "${BACKEND_PORT}:${BACKEND_PORT}"
  eve:
    ports:
      - "${EVE_PORT}:${EVE_PORT}"
      - "${MONGO_PORT}:${MONGO_PORT}"

volumes:
  eve_test_db:

View File

@@ -48,13 +48,14 @@ services:
     volumes:
       - ${SHARED_VOLUME}:/nlp-web-app/shared
       - ${LOGS_VOLUME}:/nlp-web-app/logs
-      - ${DOCUMENT_IMAGE_VOLUME}:/nlp-web-app/document_images
+      - ${DOCUMENT_IMAGE_VOLUME}:/mnt/azure
     environment:
       AL_REDIS_HOST: redis
       AL_REDIS_PORT: ${REDIS_PORT}
       AUTH_MODULE: ${AUTH_MODULE}
       PINE_LOGGING_CONFIG_FILE: /nlp-web-app/shared/logging.python.json
-      DOCUMENT_IMAGE_DIR: /nlp-web-app/document_images
+      DOCUMENT_IMAGE_DIR: /mnt/azure
       PINE_VERSION: ${PINE_VERSION:?Please set PINE_VERSION environment variable.}
       # Expose the following to test:
       # ports:
       #   - ${BACKEND_PORT}:${BACKEND_PORT}

20
docs/Makefile Normal file
View File

@@ -0,0 +1,20 @@
# Minimal makefile for Sphinx documentation
#

# You can set these variables from the command line, and also
# from the environment for the first two.
SPHINXOPTS    ?=
SPHINXBUILD   ?= sphinx-build
SOURCEDIR     = source
BUILDDIR      = build

# Put it first so that "make" without argument is like "make help".
help:
	@$(SPHINXBUILD) -M help "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)

.PHONY: help Makefile

# Catch-all target: route all unknown targets to Sphinx using the new
# "make mode" option. $(O) is meant as a shortcut for $(SPHINXOPTS).
%: Makefile
	@$(SPHINXBUILD) -M $@ "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)

18
docs/Pipfile Normal file
View File

@@ -0,0 +1,18 @@
# (C) 2019 The Johns Hopkins University Applied Physics Laboratory LLC.
[[source]]
url = "https://pypi.org/simple"
verify_ssl = true
name = "pypi"
[packages]
[dev-packages]
sphinx = "*"
sphinx-autoapi = "*"
[scripts]
doc = "make singlehtml html latexpdf LATEXMKOPTS='-silent'"
[requires]
python_version = "3.6"

319
docs/Pipfile.lock generated Normal file
View File

@@ -0,0 +1,319 @@
{
"_meta": {
"hash": {
"sha256": "aab7848f4527a249ac0b2421bb300c9995a4bf089517eabaf28ffd1997fd12a0"
},
"pipfile-spec": 6,
"requires": {
"python_version": "3.6"
},
"sources": [
{
"name": "pypi",
"url": "https://pypi.org/simple",
"verify_ssl": true
}
]
},
"default": {},
"develop": {
"alabaster": {
"hashes": [
"sha256:446438bdcca0e05bd45ea2de1668c1d9b032e1a9154c2c259092d77031ddd359",
"sha256:a661d72d58e6ea8a57f7a86e37d86716863ee5e92788398526d58b26a4e4dc02"
],
"version": "==0.7.12"
},
"astroid": {
"hashes": [
"sha256:2f4078c2a41bf377eea06d71c9d2ba4eb8f6b1af2135bec27bbbb7d8f12bb703",
"sha256:bc58d83eb610252fd8de6363e39d4f1d0619c894b0ed24603b881c02e64c7386"
],
"markers": "python_version >= '3'",
"version": "==2.4.2"
},
"babel": {
"hashes": [
"sha256:1aac2ae2d0d8ea368fa90906567f5c08463d98ade155c0c4bfedd6a0f7160e38",
"sha256:d670ea0b10f8b723672d3a6abeb87b565b244da220d76b4dba1b66269ec152d4"
],
"version": "==2.8.0"
},
"certifi": {
"hashes": [
"sha256:5930595817496dd21bb8dc35dad090f1c2cd0adfaf21204bf6732ca5d8ee34d3",
"sha256:8fc0819f1f30ba15bdb34cceffb9ef04d99f420f68eb75d901e9560b8749fc41"
],
"version": "==2020.6.20"
},
"chardet": {
"hashes": [
"sha256:84ab92ed1c4d4f16916e05906b6b75a6c0fb5db821cc65e70cbd64a3e2a5eaae",
"sha256:fc323ffcaeaed0e0a02bf4d117757b98aed530d9ed4531e3e15460124c106691"
],
"version": "==3.0.4"
},
"docutils": {
"hashes": [
"sha256:0c5b78adfbf7762415433f5515cd5c9e762339e23369dbe8000d84a4bf4ab3af",
"sha256:c2de3a60e9e7d07be26b7f2b00ca0309c207e06c100f9cc2a94931fc75a478fc"
],
"version": "==0.16"
},
"idna": {
"hashes": [
"sha256:b307872f855b18632ce0c21c5e45be78c0ea7ae4c15c828c20788b26921eb3f6",
"sha256:b97d804b1e9b523befed77c48dacec60e6dcb0b5391d57af6a65a312a90648c0"
],
"version": "==2.10"
},
"imagesize": {
"hashes": [
"sha256:6965f19a6a2039c7d48bca7dba2473069ff854c36ae6f19d2cde309d998228a1",
"sha256:b1f6b5a4eab1f73479a50fb79fcf729514a900c341d8503d62a62dbc4127a2b1"
],
"version": "==1.2.0"
},
"jinja2": {
"hashes": [
"sha256:89aab215427ef59c34ad58735269eb58b1a5808103067f7bb9d5836c651b3bb0",
"sha256:f0a4641d3cf955324a89c04f3d94663aa4d638abe8f733ecd3582848e1c37035"
],
"version": "==2.11.2"
},
"lazy-object-proxy": {
"hashes": [
"sha256:0c4b206227a8097f05c4dbdd323c50edf81f15db3b8dc064d08c62d37e1a504d",
"sha256:194d092e6f246b906e8f70884e620e459fc54db3259e60cf69a4d66c3fda3449",
"sha256:1be7e4c9f96948003609aa6c974ae59830a6baecc5376c25c92d7d697e684c08",
"sha256:4677f594e474c91da97f489fea5b7daa17b5517190899cf213697e48d3902f5a",
"sha256:48dab84ebd4831077b150572aec802f303117c8cc5c871e182447281ebf3ac50",
"sha256:5541cada25cd173702dbd99f8e22434105456314462326f06dba3e180f203dfd",
"sha256:59f79fef100b09564bc2df42ea2d8d21a64fdcda64979c0fa3db7bdaabaf6239",
"sha256:8d859b89baf8ef7f8bc6b00aa20316483d67f0b1cbf422f5b4dc56701c8f2ffb",
"sha256:9254f4358b9b541e3441b007a0ea0764b9d056afdeafc1a5569eee1cc6c1b9ea",
"sha256:9651375199045a358eb6741df3e02a651e0330be090b3bc79f6d0de31a80ec3e",
"sha256:97bb5884f6f1cdce0099f86b907aa41c970c3c672ac8b9c8352789e103cf3156",
"sha256:9b15f3f4c0f35727d3a0fba4b770b3c4ebbb1fa907dbcc046a1d2799f3edd142",
"sha256:a2238e9d1bb71a56cd710611a1614d1194dc10a175c1e08d75e1a7bcc250d442",
"sha256:a6ae12d08c0bf9909ce12385803a543bfe99b95fe01e752536a60af2b7797c62",
"sha256:ca0a928a3ddbc5725be2dd1cf895ec0a254798915fb3a36af0964a0a4149e3db",
"sha256:cb2c7c57005a6804ab66f106ceb8482da55f5314b7fcb06551db1edae4ad1531",
"sha256:d74bb8693bf9cf75ac3b47a54d716bbb1a92648d5f781fc799347cfc95952383",
"sha256:d945239a5639b3ff35b70a88c5f2f491913eb94871780ebfabb2568bd58afc5a",
"sha256:eba7011090323c1dadf18b3b689845fd96a61ba0a1dfbd7f24b921398affc357",
"sha256:efa1909120ce98bbb3777e8b6f92237f5d5c8ea6758efea36a473e1d38f7d3e4",
"sha256:f3900e8a5de27447acbf900b4750b0ddfd7ec1ea7fbaf11dfa911141bc522af0"
],
"version": "==1.4.3"
},
"markupsafe": {
"hashes": [
"sha256:00bc623926325b26bb9605ae9eae8a215691f33cae5df11ca5424f06f2d1f473",
"sha256:09027a7803a62ca78792ad89403b1b7a73a01c8cb65909cd876f7fcebd79b161",
"sha256:09c4b7f37d6c648cb13f9230d847adf22f8171b1ccc4d5682398e77f40309235",
"sha256:1027c282dad077d0bae18be6794e6b6b8c91d58ed8a8d89a89d59693b9131db5",
"sha256:13d3144e1e340870b25e7b10b98d779608c02016d5184cfb9927a9f10c689f42",
"sha256:24982cc2533820871eba85ba648cd53d8623687ff11cbb805be4ff7b4c971aff",
"sha256:29872e92839765e546828bb7754a68c418d927cd064fd4708fab9fe9c8bb116b",
"sha256:43a55c2930bbc139570ac2452adf3d70cdbb3cfe5912c71cdce1c2c6bbd9c5d1",
"sha256:46c99d2de99945ec5cb54f23c8cd5689f6d7177305ebff350a58ce5f8de1669e",
"sha256:500d4957e52ddc3351cabf489e79c91c17f6e0899158447047588650b5e69183",
"sha256:535f6fc4d397c1563d08b88e485c3496cf5784e927af890fb3c3aac7f933ec66",
"sha256:596510de112c685489095da617b5bcbbac7dd6384aeebeda4df6025d0256a81b",
"sha256:62fe6c95e3ec8a7fad637b7f3d372c15ec1caa01ab47926cfdf7a75b40e0eac1",
"sha256:6788b695d50a51edb699cb55e35487e430fa21f1ed838122d722e0ff0ac5ba15",
"sha256:6dd73240d2af64df90aa7c4e7481e23825ea70af4b4922f8ede5b9e35f78a3b1",
"sha256:717ba8fe3ae9cc0006d7c451f0bb265ee07739daf76355d06366154ee68d221e",
"sha256:79855e1c5b8da654cf486b830bd42c06e8780cea587384cf6545b7d9ac013a0b",
"sha256:7c1699dfe0cf8ff607dbdcc1e9b9af1755371f92a68f706051cc8c37d447c905",
"sha256:88e5fcfb52ee7b911e8bb6d6aa2fd21fbecc674eadd44118a9cc3863f938e735",
"sha256:8defac2f2ccd6805ebf65f5eeb132adcf2ab57aa11fdf4c0dd5169a004710e7d",
"sha256:98c7086708b163d425c67c7a91bad6e466bb99d797aa64f965e9d25c12111a5e",
"sha256:9add70b36c5666a2ed02b43b335fe19002ee5235efd4b8a89bfcf9005bebac0d",
"sha256:9bf40443012702a1d2070043cb6291650a0841ece432556f784f004937f0f32c",
"sha256:ade5e387d2ad0d7ebf59146cc00c8044acbd863725f887353a10df825fc8ae21",
"sha256:b00c1de48212e4cc9603895652c5c410df699856a2853135b3967591e4beebc2",
"sha256:b1282f8c00509d99fef04d8ba936b156d419be841854fe901d8ae224c59f0be5",
"sha256:b2051432115498d3562c084a49bba65d97cf251f5a331c64a12ee7e04dacc51b",
"sha256:ba59edeaa2fc6114428f1637ffff42da1e311e29382d81b339c1817d37ec93c6",
"sha256:c8716a48d94b06bb3b2524c2b77e055fb313aeb4ea620c8dd03a105574ba704f",
"sha256:cd5df75523866410809ca100dc9681e301e3c27567cf498077e8551b6d20e42f",
"sha256:cdb132fc825c38e1aeec2c8aa9338310d29d337bebbd7baa06889d09a60a1fa2",
"sha256:e249096428b3ae81b08327a63a485ad0878de3fb939049038579ac0ef61e17e7",
"sha256:e8313f01ba26fbbe36c7be1966a7b7424942f670f38e666995b88d012765b9be"
],
"version": "==1.1.1"
},
"packaging": {
"hashes": [
"sha256:4357f74f47b9c12db93624a82154e9b120fa8293699949152b22065d556079f8",
"sha256:998416ba6962ae7fbd6596850b80e17859a5753ba17c32284f67bfff33784181"
],
"version": "==20.4"
},
"pygments": {
"hashes": [
"sha256:647344a061c249a3b74e230c739f434d7ea4d8b1d5f3721bc0f3558049b38f44",
"sha256:ff7a40b4860b727ab48fad6360eb351cc1b33cbf9b15a0f689ca5353e9463324"
],
"version": "==2.6.1"
},
"pyparsing": {
"hashes": [
"sha256:c203ec8783bf771a155b207279b9bccb8dea02d8f0c9e5f8ead507bc3246ecc1",
"sha256:ef9d7589ef3c200abe66653d3f1ab1033c3c419ae9b9bdb1240a85b024efc88b"
],
"version": "==2.4.7"
},
"pytz": {
"hashes": [
"sha256:a494d53b6d39c3c6e44c3bec237336e14305e4f29bbf800b599253057fbb79ed",
"sha256:c35965d010ce31b23eeb663ed3cc8c906275d6be1a34393a1d73a41febf4a048"
],
"version": "==2020.1"
},
"pyyaml": {
"hashes": [
"sha256:06a0d7ba600ce0b2d2fe2e78453a470b5a6e000a985dd4a4e54e436cc36b0e97",
"sha256:240097ff019d7c70a4922b6869d8a86407758333f02203e0fc6ff79c5dcede76",
"sha256:4f4b913ca1a7319b33cfb1369e91e50354d6f07a135f3b901aca02aa95940bd2",
"sha256:69f00dca373f240f842b2931fb2c7e14ddbacd1397d57157a9b005a6a9942648",
"sha256:73f099454b799e05e5ab51423c7bcf361c58d3206fa7b0d555426b1f4d9a3eaf",
"sha256:74809a57b329d6cc0fdccee6318f44b9b8649961fa73144a98735b0aaf029f1f",
"sha256:7739fc0fa8205b3ee8808aea45e968bc90082c10aef6ea95e855e10abf4a37b2",
"sha256:95f71d2af0ff4227885f7a6605c37fd53d3a106fcab511b8860ecca9fcf400ee",
"sha256:b8eac752c5e14d3eca0e6dd9199cd627518cb5ec06add0de9d32baeee6fe645d",
"sha256:cc8955cfbfc7a115fa81d85284ee61147059a753344bc51098f3ccd69b0d7e0c",
"sha256:d13155f591e6fcc1ec3b30685d50bf0711574e2c0dfffd7644babf8b5102ca1a"
],
"version": "==5.3.1"
},
"requests": {
"hashes": [
"sha256:b3559a131db72c33ee969480840fff4bb6dd111de7dd27c8ee1f820f4f00231b",
"sha256:fe75cc94a9443b9246fc7049224f75604b113c36acb93f87b80ed42c44cbb898"
],
"version": "==2.24.0"
},
"six": {
"hashes": [
"sha256:30639c035cdb23534cd4aa2dd52c3bf48f06e5f4a941509c8bafd8ce11080259",
"sha256:8b74bedcbbbaca38ff6d7491d76f2b06b3592611af620f8426e82dddb04a5ced"
],
"version": "==1.15.0"
},
"snowballstemmer": {
"hashes": [
"sha256:209f257d7533fdb3cb73bdbd24f436239ca3b2fa67d56f6ff88e86be08cc5ef0",
"sha256:df3bac3df4c2c01363f3dd2cfa78cce2840a79b9f1c2d2de9ce8d31683992f52"
],
"version": "==2.0.0"
},
"sphinx": {
"hashes": [
"sha256:97dbf2e31fc5684bb805104b8ad34434ed70e6c588f6896991b2fdfd2bef8c00",
"sha256:b9daeb9b39aa1ffefc2809b43604109825300300b987a24f45976c001ba1a8fd"
],
"index": "pypi",
"version": "==3.1.2"
},
"sphinx-autoapi": {
"hashes": [
"sha256:eb86024fb04f6f1c61d8be73f56db40bf730a932cf0c8d0456a43bae4c11b508",
"sha256:f76ef71d443c6a9ad5e1b326d4dfc196e2080e8b46141c45d1bb47a73a34f190"
],
"index": "pypi",
"version": "==1.4.0"
},
"sphinxcontrib-applehelp": {
"hashes": [
"sha256:806111e5e962be97c29ec4c1e7fe277bfd19e9652fb1a4392105b43e01af885a",
"sha256:a072735ec80e7675e3f432fcae8610ecf509c5f1869d17e2eecff44389cdbc58"
],
"version": "==1.0.2"
},
"sphinxcontrib-devhelp": {
"hashes": [
"sha256:8165223f9a335cc1af7ffe1ed31d2871f325254c0423bc0c4c7cd1c1e4734a2e",
"sha256:ff7f1afa7b9642e7060379360a67e9c41e8f3121f2ce9164266f61b9f4b338e4"
],
"version": "==1.0.2"
},
"sphinxcontrib-htmlhelp": {
"hashes": [
"sha256:3c0bc24a2c41e340ac37c85ced6dafc879ab485c095b1d65d2461ac2f7cca86f",
"sha256:e8f5bb7e31b2dbb25b9cc435c8ab7a79787ebf7f906155729338f3156d93659b"
],
"version": "==1.0.3"
},
"sphinxcontrib-jsmath": {
"hashes": [
"sha256:2ec2eaebfb78f3f2078e73666b1415417a116cc848b72e5172e596c871103178",
"sha256:a9925e4a4587247ed2191a22df5f6970656cb8ca2bd6284309578f2153e0c4b8"
],
"version": "==1.0.1"
},
"sphinxcontrib-qthelp": {
"hashes": [
"sha256:4c33767ee058b70dba89a6fc5c1892c0d57a54be67ddd3e7875a18d14cba5a72",
"sha256:bd9fc24bcb748a8d51fd4ecaade681350aa63009a347a8c14e637895444dfab6"
],
"version": "==1.0.3"
},
"sphinxcontrib-serializinghtml": {
"hashes": [
"sha256:eaa0eccc86e982a9b939b2b82d12cc5d013385ba5eadcc7e4fed23f4405f77bc",
"sha256:f242a81d423f59617a8e5cf16f5d4d74e28ee9a66f9e5b637a18082991db5a9a"
],
"version": "==1.1.4"
},
"typed-ast": {
"hashes": [
"sha256:0666aa36131496aed8f7be0410ff974562ab7eeac11ef351def9ea6fa28f6355",
"sha256:0c2c07682d61a629b68433afb159376e24e5b2fd4641d35424e462169c0a7919",
"sha256:249862707802d40f7f29f6e1aad8d84b5aa9e44552d2cc17384b209f091276aa",
"sha256:24995c843eb0ad11a4527b026b4dde3da70e1f2d8806c99b7b4a7cf491612652",
"sha256:269151951236b0f9a6f04015a9004084a5ab0d5f19b57de779f908621e7d8b75",
"sha256:4083861b0aa07990b619bd7ddc365eb7fa4b817e99cf5f8d9cf21a42780f6e01",
"sha256:498b0f36cc7054c1fead3d7fc59d2150f4d5c6c56ba7fb150c013fbc683a8d2d",
"sha256:4e3e5da80ccbebfff202a67bf900d081906c358ccc3d5e3c8aea42fdfdfd51c1",
"sha256:6daac9731f172c2a22ade6ed0c00197ee7cc1221aa84cfdf9c31defeb059a907",
"sha256:715ff2f2df46121071622063fc7543d9b1fd19ebfc4f5c8895af64a77a8c852c",
"sha256:73d785a950fc82dd2a25897d525d003f6378d1cb23ab305578394694202a58c3",
"sha256:8c8aaad94455178e3187ab22c8b01a3837f8ee50e09cf31f1ba129eb293ec30b",
"sha256:8ce678dbaf790dbdb3eba24056d5364fb45944f33553dd5869b7580cdbb83614",
"sha256:aaee9905aee35ba5905cfb3c62f3e83b3bec7b39413f0a7f19be4e547ea01ebb",
"sha256:bcd3b13b56ea479b3650b82cabd6b5343a625b0ced5429e4ccad28a8973f301b",
"sha256:c9e348e02e4d2b4a8b2eedb48210430658df6951fa484e59de33ff773fbd4b41",
"sha256:d205b1b46085271b4e15f670058ce182bd1199e56b317bf2ec004b6a44f911f6",
"sha256:d43943ef777f9a1c42bf4e552ba23ac77a6351de620aa9acf64ad54933ad4d34",
"sha256:d5d33e9e7af3b34a40dc05f498939f0ebf187f07c385fd58d591c533ad8562fe",
"sha256:fc0fea399acb12edbf8a628ba8d2312f583bdbdb3335635db062fa98cf71fca4",
"sha256:fe460b922ec15dd205595c9b5b99e2f056fd98ae8f9f56b888e7a17dc2b757e7"
],
"markers": "implementation_name == 'cpython' and python_version < '3.8'",
"version": "==1.4.1"
},
"unidecode": {
"hashes": [
"sha256:1d7a042116536098d05d599ef2b8616759f02985c85b4fef50c78a5aaf10822a",
"sha256:2b6aab710c2a1647e928e36d69c21e76b453cd455f4e2621000e54b2a9b8cce8"
],
"version": "==1.1.1"
},
"urllib3": {
"hashes": [
"sha256:3018294ebefce6572a474f0604c2021e33b3fd8006ecd11d62107a5d2a963527",
"sha256:88206b0eb87e6d677d424843ac5209e3fb9d0190d0ee169599165ec25e9d9115"
],
"version": "==1.25.9"
},
"wrapt": {
"hashes": [
"sha256:b62ffa81fb85f4332a4f609cab4ac40709470da05643a082ec1eb88e6d9b97d7"
],
"version": "==1.12.1"
}
}
}

20
docs/README.md Normal file
View File

@@ -0,0 +1,20 @@
&copy; 2019 The Johns Hopkins University Applied Physics Laboratory LLC.
## Developer Environment
Required packages:
* python 3.6
* pipenv
* make
Install the dependencies with `pipenv install --dev`.
### Generating documentation
Make sure you have the proper latex packages installed:
* `sudo apt install latexmk texlive-latex-recommended texlive-fonts-recommended texlive-latex-extra`
Generate through pipenv:
* `pipenv run doc`, or use the convenience script `../generate_documentation.sh`
Generated documentation can then be found in `./build`.

Binary file not shown.

Some files were not shown because too many files have changed in this diff.