first public commit

Brant Chee
2020-06-19 13:19:50 -04:00
commit 14d79ed796
390 changed files with 1087023 additions and 0 deletions

184
README.md Normal file

@@ -0,0 +1,184 @@
© 2019 The Johns Hopkins University Applied Physics Laboratory LLC.
## Required Resources
Note: download the following required resources and place them in `pipelines/pine/pipelines/resources`:
* apache-opennlp-1.9.0
* stanford-corenlp-full-2018-02-27
* stanford-ner-2018-02-27

These are required to build the docker images for active learning.
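As a rough sketch, placing the resources might look like the following (assuming the standard
upstream distribution archives, which are not part of this repository):
```bash
# a minimal sketch, assuming the standard distribution archives have already been downloaded;
# the archive names are the upstream defaults, not files in this repository
mkdir -p pipelines/pine/pipelines/resources
tar -xzf apache-opennlp-1.9.0-bin.tar.gz -C pipelines/pine/pipelines/resources
unzip stanford-corenlp-full-2018-02-27.zip -d pipelines/pine/pipelines/resources
unzip stanford-ner-2018-02-27.zip -d pipelines/pine/pipelines/resources
```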
## Configuring Logging
See logging configuration files in `./shared/`. `logging.python.dev.json` is used with the
dev stack; the other files are used in the docker containers.
The docker-compose stack is currently set to bind the `./shared/` directory into the containers
at run-time. This allows logging configuration changes without rebuilding containers, and lets
the Python logging config live in one place instead of being spread across each container. This
is controlled with the `${SHARED_VOLUME}` variable from `.env`.
Log files are stored in the location given by the `${LOGS_VOLUME}` variable from `.env`. Pipeline
model files are stored in the location given by the `${MODELS_VOLUME}` variable from `.env`.
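For example, after editing a logging config you can validate it and restart the affected container
instead of rebuilding (a sketch; the `backend` service name is an assumption, restart whichever
container you changed):
```bash
# a minimal sketch: validate an edited logging config, then restart one container to pick it up
# (no rebuild needed thanks to the ${SHARED_VOLUME} bind mount); "backend" is an assumed service name
python3 -m json.tool shared/logging.python.dev.json > /dev/null && echo "logging config parses"
docker-compose restart backend
```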
## Development Environment
First, refer to the various README files in the subproject directories for dependencies.
Install the pipenv environment in `pipelines`.
Then a dev stack can be run with:
```bash
./run_dev_stack.py
```
You probably also need to update `.env` for `VEGAS_CLIENT_SECRET`, if you are
planning to use that auth module.
The dev stack can be stopped with Ctrl-C.
Sometimes mongod does not start in time. If you see a connection error for mongod, stop the dev
stack and try again.
Once the dev stack is up and running, the following ports are accessible:
* `localhost:4200` is the main entrypoint and hosts the web app
* `localhost:5000` hosts the backend
* `localhost:5001` hosts the eve layer
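As a quick smoke test of a running dev stack (assuming the default ports above):
```bash
# minimal check that the backend is up; the /ping endpoint should return "pong"
curl http://localhost:5000/ping
```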
### Using the copyright checking pre-commit hook
The script `pre-commit` is provided as a helpful utility to make sure that new files checked into
the repository contain the copyright text. It is _not_ automatically installed and must be
installed manually:
`ln -s ../../pre-commit .git/hooks/`
This hook greps for the copyright text in new files and gives you the option to abort if it is
not found.
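In rough outline, the hook does something like the following (a simplified sketch only; the actual
`pre-commit` script in the repository root is the authoritative version):
```bash
#!/bin/bash
# simplified sketch of the copyright check performed by the pre-commit hook
copyright="The Johns Hopkins University Applied Physics Laboratory"
for f in $(git diff --cached --name-only --diff-filter=A); do
    if ! grep -q "$copyright" "$f"; then
        read -p "New file $f is missing the copyright text. Commit anyway? [y/N] " answer < /dev/tty
        [[ "$answer" == "y" ]] || exit 1
    fi
done
```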
### Clearing the database
First, stop your dev stack. Then `rm -rf eve/db` and start the stack again.
### Importing test data
Once running, test data can be imported with:
```bash
./setup_dev_data.sh
```
### Updating existing data
If there is existing data in the database, it may need to be migrated. To do this, run the
following once the system is up and running:
```bash
cd eve/python && python3 update_documents_annnotation_status.py
```
## Docker Environments
The docker environment is run using docker-compose. There are two supported configurations: the
default and the prod configuration.
If desired, edit `.env` to change default variable values. You probably also need to update
`.env` for `VEGAS_CLIENT_SECRET`, if you are planning to use that auth module.
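A hypothetical `.env` excerpt (the variable names are the ones referenced in this README; the
values are placeholders for illustration):
```bash
# hypothetical .env values -- adjust paths and secrets for your deployment
SHARED_VOLUME=./shared
LOGS_VOLUME=./local_data/logs
MODELS_VOLUME=./local_data/models
DOCUMENT_IMAGE_VOLUME=./local_data/dev/test_images
VEGAS_CLIENT_SECRET=<your client secret>
# MONGO_URI is only needed for the prod configuration (see below)
#MONGO_URI=mongodb://...
```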
To build the images for DEFAULT configuration:
```bash
docker-compose build
```
To run containers as daemons for DEFAULT configuration (remove -d flag to see logs):
```bash
docker-compose up -d
```
You may also want the `--abort-on-container-exit` flag (only valid without `-d`), which makes
errors more apparent.
With default settings, the webapp will now be accessible at https://localhost:8888
To watch logs for DEFAULT configuration:
```bash
docker-compose logs -f
```
To bring containers down for DEFAULT configuration:
```bash
docker-compose down
```
### Production Docker Environment
To use the production docker environment instead of the default, simply add
`-f docker-compose.yml -f docker-compose.prod.yml` after the `docker-compose` command, e.g.:
```bash
docker-compose -f docker-compose.yml -f docker-compose.prod.yml build
```
Note that you probably need to update `.env` and add the `MONGO_URI` property.
### Clearing the database
Bring all the containers down. Then do a `docker ps --all` and find the numeric ID of the eve
container and remove it with `docker rm <id>`. Then, remove the two eve volumes with
`docker volume rm nlp_webapp_eve_db` and `docker volume rm nlp_webapp_eve_logs`. Finally, restart
your containers.
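Collected into one sequence (the container ID placeholder is whatever `docker ps --all` reports
for the eve container):
```bash
# reset the dockerized database; volume names follow the docker-compose project naming above
docker-compose down
docker ps --all                                        # find the eve container's ID
docker rm <eve container id>
docker volume rm nlp_webapp_eve_db nlp_webapp_eve_logs
docker-compose up -d
```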
### Importing test data
Once the system is up and running:
```bash
./setup_docker_test_data.sh
```
### Updating existing data
If there is existing data in the database, it may need to be migrated. To do this, run the
following once the system is up and running:
```bash
docker-compose exec eve python3 python/update_documents_annnotation_status.py
```
### User management using "eve" auth module
Note: these scripts only apply to the "eve" auth module, which stores users
in the eve database. Users in the "vegas" module are managed externally.
Once the system is up and running, this script will list the users in the database:
```bash
docker-compose exec backend scripts/data/list_users.sh
```
This script will reset all user passwords to their email:
```bash
docker-compose exec backend scripts/data/reset_user_passwords.sh
```
This script will add a new administrator:
```bash
docker-compose exec backend scripts/data/add_admin.sh <email username> <password>
```
This script will set a single user's password:
```bash
docker-compose exec backend scripts/data/set_user_password.sh <email username> <password>
```
Alternatively, there is an Admin Dashboard through the web interface.
### Collection/Document Images
It is now possible to explore images in the "annotate document" page in the frontend UI. The image
URL is specified in the metadata field with the key `imageUrl`. If the URL starts with a "/" it
is loaded from a special endpoint in the backend that loads from a locally attached volume. For
docker, this volume is controlled by the `DOCUMENT_IMAGE_VOLUME` variable in `.env`. For running
the dev stack, this volume can be found in `./local_data/dev/test_images`.
To upload images outside the UI, the following procedures should be used (a sketch of the
resulting layout follows this list):
* All images in the collection should be in the directory `<image volume>/by_collection/<collection ID>`.
* Subdirectories (such as for individual documents) are allowed but not mandatory.
* The document metadata `imageUrl` should be set to `/<image path within the collection directory>`.
* For example: an imageUrl of `/image.jpg` would load `/<image volume>/by_collection/<collection ID>/image.jpg`
through the backend.
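For example, a hypothetical layout (the `doc1/page2.png` path is illustrative only):
```bash
# hypothetical image layout and the corresponding document metadata values
<image volume>/by_collection/<collection ID>/image.jpg        # metadata: "imageUrl": "/image.jpg"
<image volume>/by_collection/<collection ID>/doc1/page2.png   # metadata: "imageUrl": "/doc1/page2.png"
```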


@@ -0,0 +1,166 @@
# (C) 2019 The Johns Hopkins University Applied Physics Laboratory LLC.
parameters:
appReleaseName: ""
appTlsSecretName: $(appTlsSecretName)
appUrl: ""
azureContainerRegistry: $(azureContainerRegistry)
azureSubscriptionEndpointForSecrets: $(azureSubscriptionEndpointForSecrets)
deployEnvironment: $(deployEnvironment)
deploymentName: "CONTAINER_DEPLOY"
helmChart: "pine-chart"
imageTag: $(Build.BuildId)
ingressClass: "nginx"
kubeServiceConnection: $(kubeServiceConnection)
namespace: $(namespace)
secrets: []
redisImageName: $(redisImageName)
eveImageName: $(eveImageName)
backendImageName: $(backendImageName)
frontendImageName: $(frontendImageName)
pipelineImageName: $(pipelineImageName)
jobs:
# track deployments on the environment
- deployment: "${{ parameters.deploymentName }}"
pool: Default
# creates an environment if it doesn't exist
environment: ${{ parameters.deployEnvironment }}
strategy:
# default deployment strategy
runOnce:
deploy:
steps:
- task: Bash@3
displayName: Display settings
inputs:
targetType: 'inline'
script: |
echo "appReleaseName: ${{ parameters.appReleaseName }}"
echo "appUrl: ${{ parameters.appUrl }}"
echo "deployEnvironment: ${{ parameters.deployEnvironment }}"
echo "kubeServiceConnection: ${{ parameters.kubeServiceConnection }}"
echo "namespace: ${{ parameters.namespace }}"
echo "ingressClass: ${{ parameters.ingressClass }}"
echo "imageTag: ${{ parameters.imageTag }}"
- bash: |
if [ -z "$APP_RELEASE_NAME" ]; then
echo "##vso[task.logissue type=error;]Missing template parameter \"appReleaseName\""
echo "##vso[task.complete result=Failed;]"
fi
if [ -z "$APP_URL" ]; then
echo "##vso[task.logissue type=error;]Missing template parameter \"appUrl\""
echo "##vso[task.complete result=Failed;]"
fi
if [ -z "$AZURE_SUBSCRIPTION" ]; then
echo "##vso[task.logissue type=error;]Missing variable \"azureSubscriptionEndpointForSecrets\""
echo "##vso[task.complete result=Failed;]"
fi
env:
APP_RELEASE_NAME: ${{ parameters.appReleaseName }}
APP_URL: ${{ parameters.appUrl }}
AZURE_SUBSCRIPTION: ${{ parameters.azureSubscriptionEndpointForSecrets }}
displayName: Check for required parameters
- task: HelmInstaller@1
displayName: 'Install Helm 2.16.1'
inputs:
helmVersionToInstall: 2.16.1
- task: HelmDeploy@0
displayName: 'helm init'
inputs:
connectionType: None
command: init
upgradeTiller: false
arguments: '-c'
tillerNamespace: '${{ parameters.namespace }}'
- task: Bash@3
displayName: "set default overrides"
inputs:
targetType: 'inline'
script: |
#!/bin/bash
echo "Creating pipelineHelmOverrideValues.yml file"
cat > pipelineHelmOverrideValues.yml <<- EOM
fullnameOverride: ${{ parameters.appReleaseName }}
name: ${{ parameters.appReleaseName }}
eve:
image:
repository: ${{ parameters.azureContainerRegistry }}/${{ parameters.eveImageName }}
tag: ${{ parameters.imageTag }}
redis:
image:
repository: ${{ parameters.azureContainerRegistry }}/${{ parameters.redisImageName }}
tag: ${{ parameters.imageTag }}
backend:
image:
repository: ${{ parameters.azureContainerRegistry }}/${{ parameters.backendImageName }}
tag: ${{ parameters.imageTag }}
nlpAnnotation:
image:
repository: ${{ parameters.azureContainerRegistry }}/${{ parameters.pipelineImageName }}
tag: ${{ parameters.imageTag }}
frontend:
serverName: ${{ parameters.appUrl }}
image:
repository: ${{ parameters.azureContainerRegistry }}/${{ parameters.frontendImageName }}
tag: ${{ parameters.imageTag }}
namespace: ${{ parameters.namespace }}
ingress:
annotations:
kubernetes.io/ingress.class: ${{ parameters.ingressClass }}
hosts:
- ${{ parameters.appUrl }}
tls:
- hosts:
- ${{ parameters.appUrl }}
secretName: ${{ parameters.appTlsSecretName }}
EOM
echo "File created"
cat pipelineHelmOverrideValues.yml
- ${{ if ne(parameters.secrets, '') }}:
- task: Bash@3
displayName: "Add secrets section to overrides"
inputs:
targetType: 'inline'
script: |
#!/bin/bash
cat >> pipelineHelmOverrideValues.yml <<- EOM
secrets:
EOM
echo "File updated"
- ${{ each secret in parameters.secrets }}:
- task: Bash@3
displayName: "Add secret to overrides"
inputs:
targetType: 'inline'
script: |
#!/bin/bash
cat >> pipelineHelmOverrideValues.yml <<- EOM
${{ secret.key }}:
EOM
echo "File updated"
- ${{ each secretData in secret.value }}:
- task: Bash@3
displayName: "Add secret data to overrides"
inputs:
targetType: 'inline'
script: |
#!/bin/bash
cat >> pipelineHelmOverrideValues.yml <<- EOM
${{ secretData.key }}: ${{ secretData.value }}
EOM
- task: KubernetesManifest@0
displayName: bake
name: 'bake'
inputs:
action: bake
namespace: '${{ parameters.namespace }}'
helmChart: '$(Pipeline.Workspace)/${{ parameters.helmChart }}'
releaseName: ${{ parameters.appReleaseName }}
overrideFiles: 'pipelineHelmOverrideValues.yml'
timeoutInMinutes: 900
- task: KubernetesManifest@0
displayName: deploy
inputs:
kubernetesServiceConnection: '${{ parameters.kubeServiceConnection }}'
namespace: '${{ parameters.namespace }}'
manifests: $(bake.manifestsBundle)

147
azure-pipelines.yml Normal file

@@ -0,0 +1,147 @@
# (C) 2019 The Johns Hopkins University Applied Physics Laboratory LLC.
# Docker
# Build and push an image to Azure Container Registry
# https://docs.microsoft.com/azure/devops/pipelines/languages/docker
trigger:
batch: true
branches:
include:
- master
#- release/*
- develop
## Reference to repository containing common templates and variables
resources:
repositories:
- repository: templates
type: git
name: APPLICATIONS/tdo-azure-devops-templates
variables:
## Import of common variables
- template: variables/pmap-common.yml@templates # Template reference
- group: "nlp_annotator"
# appReleaseName is the name of helm release
- name: appReleaseName
value: "pine"
- name: helmChart
value: "pine-chart"
- name: redisImageName
value: "pine/redis"
- name: eveImageName
value: "pine/eve"
- name: backendImageName
value: "pine/backend"
- name: frontendImageName
value: "pine/frontend"
- name: pipelineImageName
value: "pine/al_pipeline"
stages:
- stage: build_test
displayName: Build and Test
jobs:
- job: build_test
displayName: Build
pool:
vmImage: $(vmImageName)
steps:
- task: Docker@2
displayName: Build and push Pine Redis image to container registry
inputs:
command: buildAndPush
repository: $(redisImageName)
dockerfile: redis/Dockerfile
containerRegistry: $(containerRegistry)
tags: |
$(Build.BuildId)
- task: Docker@2
displayName: Build and push Pine Eve image to container registry
inputs:
command: buildAndPush
repository: $(eveImageName)
dockerfile: eve/Dockerfile
containerRegistry: $(containerRegistry)
tags: |
$(Build.BuildId)
- task: Docker@2
displayName: Build and push Pine Backend image to container registry
inputs:
command: buildAndPush
repository: $(backendImageName)
dockerfile: backend/Dockerfile
containerRegistry: $(containerRegistry)
tags: |
$(Build.BuildId)
- task: Docker@2
displayName: Build and push Pine Frontend image to container registry
inputs:
command: buildAndPush
repository: $(frontendImageName)
dockerfile: frontend/annotation/Dockerfile
containerRegistry: $(containerRegistry)
tags: |
$(Build.BuildId)
- task: Docker@2
displayName: Build and push Pine al_pipeline image to container registry
inputs:
command: buildAndPush
repository: $(pipelineImageName)
dockerfile: pipelines/docker/Dockerfile
buildContext: pipelines/
containerRegistry: $(containerRegistry)
tags: |
$(Build.BuildId)
- task: PublishPipelineArtifact@1
inputs:
targetPath: 'pine-chart'
artifact: 'pine-chart'
publishLocation: 'pipeline'
- stage: deploy_to_dev
displayName: Deploy to dev
condition: and(succeeded(), eq(variables['build.sourceBranch'], 'refs/heads/develop'))
dependsOn: build_test
jobs:
- template: azure-pipeline-templates/deploy.yml # Template reference
parameters:
appReleaseName: $(appReleaseName)
appUrl: "dev-nlpannotator.pm.jh.edu"
deployEnvironment: $(devEnvironment)
kubeServiceConnection: $(devEnvironment)
namespace: $(devNamespace)
imageTag: $(Build.BuildId)
redisImageName: $(redisImageName)
eveImageName: $(eveImageName)
backendImageName: $(backendImageName)
frontendImageName: $(frontendImageName)
pipelineImageName: $(pipelineImageName)
secrets:
backend:
VEGAS_CLIENT_SECRET: $(vegas-client-secret-dev)
eve:
MONGO_URI: $(mongo-uri-dev)
- stage: deploy_to_prod
displayName: Deploy to prod
condition: and(succeeded(), eq(variables['build.sourceBranch'], 'refs/heads/master'))
dependsOn: build_test
jobs:
- template: azure-pipeline-templates/deploy.yml # Template reference
parameters:
appReleaseName: $(appReleaseName)
appUrl: "nlpannotator.pm.jh.edu"
deployEnvironment: $(prodEnvironment)
kubeServiceConnection: $(prodEnvironment)
namespace: $(prodNamespace)
imageTag: $(Build.BuildId)
redisImageName: $(redisImageName)
eveImageName: $(eveImageName)
backendImageName: $(backendImageName)
frontendImageName: $(frontendImageName)
pipelineImageName: $(pipelineImageName)
secrets:
backend:
VEGAS_CLIENT_SECRET: $(vegas-client-secret-prod)
eve:
MONGO_URI: $(mongo-uri-prod)

1
backend/.dockerignore Normal file

@@ -0,0 +1 @@
instance/

4
backend/.gitignore vendored Normal file

@@ -0,0 +1,4 @@
# (C) 2019 The Johns Hopkins University Applied Physics Laboratory LLC.
# Flask
/instance/

43
backend/Dockerfile Normal file

@@ -0,0 +1,43 @@
# (C) 2019 The Johns Hopkins University Applied Physics Laboratory LLC.
FROM ubuntu:18.04
ENV LC_ALL C.UTF-8
ENV LANG C.UTF-8
ENV DEBIAN_FRONTEND noninteractive
RUN apt-get clean && \
apt-get -y update && \
apt-get -y install software-properties-common
RUN apt-get -y update && \
apt-get -y install git build-essential python3.6 python3-pip gettext-base && \
pip3 install --upgrade pip gunicorn pipenv
ARG ROOT_DIR=/nlp-web-app/backend
ARG REDIS_PORT=6379
ARG PORT=7520
ARG WORKERS=5
EXPOSE $PORT
ENV REDIS_PORT $REDIS_PORT
RUN mkdir -p $ROOT_DIR
ADD Pipfile $ROOT_DIR
ADD Pipfile.lock $ROOT_DIR
WORKDIR $ROOT_DIR
RUN pipenv install --system --deploy
ADD pine/ $ROOT_DIR/pine/
ADD scripts/ $ROOT_DIR/scripts/
ADD docker/wsgi.py $ROOT_DIR/
ADD docker_run.sh $ROOT_DIR/
COPY docker/config.py.template ./
RUN PORT=$PORT WORKERS=$WORKERS envsubst '${PORT} ${WORKERS}' < ./config.py.template > ./config.py
CMD ["./docker_run.sh"]

30
backend/Pipfile Normal file

@@ -0,0 +1,30 @@
# (C) 2019 The Johns Hopkins University Applied Physics Laboratory LLC.
[[source]]
url = "https://pypi.org/simple"
verify_ssl = true
name = "pypi"
[packages]
flask = "*"
overrides = "*"
flask-cors = "*"
requests = "*"
bcrypt = "*"
redis = "*"
aioredis = "*"
six = "*"
munch = "*"
pebble = "*"
pydash = "*"
pyjwt = "*"
authlib = "*"
matplotlib = "*"
scipy = "*"
tabulate = "*"
multiprocessing-logging = "*"
python-json-logger = "*"
[dev-packages]
[requires]
python_version = "3.6"

487
backend/Pipfile.lock generated Normal file

@@ -0,0 +1,487 @@
{
"_meta": {
"hash": {
"sha256": "3e01ed7cf96e7f5b79661ba4fa72636cc91faa32c7e887dd109331c16190f90b"
},
"pipfile-spec": 6,
"requires": {
"python_version": "3.6"
},
"sources": [
{
"name": "pypi",
"url": "https://pypi.org/simple",
"verify_ssl": true
}
]
},
"default": {
"aioredis": {
"hashes": [
"sha256:15f8af30b044c771aee6787e5ec24694c048184c7b9e54c3b60c750a4b93273a",
"sha256:b61808d7e97b7cd5a92ed574937a079c9387fdadd22bfbfa7ad2fd319ecc26e3"
],
"index": "pypi",
"version": "==1.3.1"
},
"async-timeout": {
"hashes": [
"sha256:0c3c816a028d47f659d6ff5c745cb2acf1f966da1fe5c19c77a70282b25f4c5f",
"sha256:4291ca197d287d274d0b6cb5d6f8f8f82d434ed288f962539ff18cc9012f9ea3"
],
"version": "==3.0.1"
},
"authlib": {
"hashes": [
"sha256:89d55b14362f8acee450f9d153645e438e3a38be99b599190718c4406f575b05",
"sha256:b6d3f59f609d352bff26dce2c7969cff7204213fae1c21742037b7aa8d7360a6"
],
"index": "pypi",
"version": "==0.14.1"
},
"bcrypt": {
"hashes": [
"sha256:0258f143f3de96b7c14f762c770f5fc56ccd72f8a1857a451c1cd9a655d9ac89",
"sha256:0b0069c752ec14172c5f78208f1863d7ad6755a6fae6fe76ec2c80d13be41e42",
"sha256:19a4b72a6ae5bb467fea018b825f0a7d917789bcfe893e53f15c92805d187294",
"sha256:5432dd7b34107ae8ed6c10a71b4397f1c853bd39a4d6ffa7e35f40584cffd161",
"sha256:6305557019906466fc42dbc53b46da004e72fd7a551c044a827e572c82191752",
"sha256:69361315039878c0680be456640f8705d76cb4a3a3fe1e057e0f261b74be4b31",
"sha256:6fe49a60b25b584e2f4ef175b29d3a83ba63b3a4df1b4c0605b826668d1b6be5",
"sha256:74a015102e877d0ccd02cdeaa18b32aa7273746914a6c5d0456dd442cb65b99c",
"sha256:763669a367869786bb4c8fcf731f4175775a5b43f070f50f46f0b59da45375d0",
"sha256:8b10acde4e1919d6015e1df86d4c217d3b5b01bb7744c36113ea43d529e1c3de",
"sha256:9fe92406c857409b70a38729dbdf6578caf9228de0aef5bc44f859ffe971a39e",
"sha256:a190f2a5dbbdbff4b74e3103cef44344bc30e61255beb27310e2aec407766052",
"sha256:a595c12c618119255c90deb4b046e1ca3bcfad64667c43d1166f2b04bc72db09",
"sha256:c9457fa5c121e94a58d6505cadca8bed1c64444b83b3204928a866ca2e599105",
"sha256:cb93f6b2ab0f6853550b74e051d297c27a638719753eb9ff66d1e4072be67133",
"sha256:ce4e4f0deb51d38b1611a27f330426154f2980e66582dc5f438aad38b5f24fc1",
"sha256:d7bdc26475679dd073ba0ed2766445bb5b20ca4793ca0db32b399dccc6bc84b7",
"sha256:ff032765bb8716d9387fd5376d987a937254b0619eff0972779515b5c98820bc"
],
"index": "pypi",
"version": "==3.1.7"
},
"certifi": {
"hashes": [
"sha256:1d987a998c75633c40847cc966fcf5904906c920a7f17ef374f5aa4282abd304",
"sha256:51fcb31174be6e6664c5f69e3e1691a2d72a1a12e90f872cbdb1567eb47b6519"
],
"version": "==2020.4.5.1"
},
"cffi": {
"hashes": [
"sha256:001bf3242a1bb04d985d63e138230802c6c8d4db3668fb545fb5005ddf5bb5ff",
"sha256:00789914be39dffba161cfc5be31b55775de5ba2235fe49aa28c148236c4e06b",
"sha256:028a579fc9aed3af38f4892bdcc7390508adabc30c6af4a6e4f611b0c680e6ac",
"sha256:14491a910663bf9f13ddf2bc8f60562d6bc5315c1f09c704937ef17293fb85b0",
"sha256:1cae98a7054b5c9391eb3249b86e0e99ab1e02bb0cc0575da191aedadbdf4384",
"sha256:2089ed025da3919d2e75a4d963d008330c96751127dd6f73c8dc0c65041b4c26",
"sha256:2d384f4a127a15ba701207f7639d94106693b6cd64173d6c8988e2c25f3ac2b6",
"sha256:337d448e5a725bba2d8293c48d9353fc68d0e9e4088d62a9571def317797522b",
"sha256:399aed636c7d3749bbed55bc907c3288cb43c65c4389964ad5ff849b6370603e",
"sha256:3b911c2dbd4f423b4c4fcca138cadde747abdb20d196c4a48708b8a2d32b16dd",
"sha256:3d311bcc4a41408cf5854f06ef2c5cab88f9fded37a3b95936c9879c1640d4c2",
"sha256:62ae9af2d069ea2698bf536dcfe1e4eed9090211dbaafeeedf5cb6c41b352f66",
"sha256:66e41db66b47d0d8672d8ed2708ba91b2f2524ece3dee48b5dfb36be8c2f21dc",
"sha256:675686925a9fb403edba0114db74e741d8181683dcf216be697d208857e04ca8",
"sha256:7e63cbcf2429a8dbfe48dcc2322d5f2220b77b2e17b7ba023d6166d84655da55",
"sha256:8a6c688fefb4e1cd56feb6c511984a6c4f7ec7d2a1ff31a10254f3c817054ae4",
"sha256:8c0ffc886aea5df6a1762d0019e9cb05f825d0eec1f520c51be9d198701daee5",
"sha256:95cd16d3dee553f882540c1ffe331d085c9e629499ceadfbda4d4fde635f4b7d",
"sha256:99f748a7e71ff382613b4e1acc0ac83bf7ad167fb3802e35e90d9763daba4d78",
"sha256:b8c78301cefcf5fd914aad35d3c04c2b21ce8629b5e4f4e45ae6812e461910fa",
"sha256:c420917b188a5582a56d8b93bdd8e0f6eca08c84ff623a4c16e809152cd35793",
"sha256:c43866529f2f06fe0edc6246eb4faa34f03fe88b64a0a9a942561c8e22f4b71f",
"sha256:cab50b8c2250b46fe738c77dbd25ce017d5e6fb35d3407606e7a4180656a5a6a",
"sha256:cef128cb4d5e0b3493f058f10ce32365972c554572ff821e175dbc6f8ff6924f",
"sha256:cf16e3cf6c0a5fdd9bc10c21687e19d29ad1fe863372b5543deaec1039581a30",
"sha256:e56c744aa6ff427a607763346e4170629caf7e48ead6921745986db3692f987f",
"sha256:e577934fc5f8779c554639376beeaa5657d54349096ef24abe8c74c5d9c117c3",
"sha256:f2b0fa0c01d8a0c7483afd9f31d7ecf2d71760ca24499c8697aeb5ca37dc090c"
],
"version": "==1.14.0"
},
"chardet": {
"hashes": [
"sha256:84ab92ed1c4d4f16916e05906b6b75a6c0fb5db821cc65e70cbd64a3e2a5eaae",
"sha256:fc323ffcaeaed0e0a02bf4d117757b98aed530d9ed4531e3e15460124c106691"
],
"version": "==3.0.4"
},
"click": {
"hashes": [
"sha256:8a18b4ea89d8820c5d0c7da8a64b2c324b4dabb695804dbfea19b9be9d88c0cc",
"sha256:e345d143d80bf5ee7534056164e5e112ea5e22716bbb1ce727941f4c8b471b9a"
],
"version": "==7.1.1"
},
"cryptography": {
"hashes": [
"sha256:0cacd3ef5c604b8e5f59bf2582c076c98a37fe206b31430d0cd08138aff0986e",
"sha256:192ca04a36852a994ef21df13cca4d822adbbdc9d5009c0f96f1d2929e375d4f",
"sha256:19ae795137682a9778892fb4390c07811828b173741bce91e30f899424b3934d",
"sha256:1b9b535d6b55936a79dbe4990b64bb16048f48747c76c29713fea8c50eca2acf",
"sha256:2a2ad24d43398d89f92209289f15265107928f22a8d10385f70def7a698d6a02",
"sha256:3be7a5722d5bfe69894d3f7bbed15547b17619f3a88a318aab2e37f457524164",
"sha256:49870684da168b90110bbaf86140d4681032c5e6a2461adc7afdd93be5634216",
"sha256:587f98ce27ac4547177a0c6fe0986b8736058daffe9160dcf5f1bd411b7fbaa1",
"sha256:5aca6f00b2f42546b9bdf11a69f248d1881212ce5b9e2618b04935b87f6f82a1",
"sha256:6b744039b55988519cc183149cceb573189b3e46e16ccf6f8c46798bb767c9dc",
"sha256:6b91cab3841b4c7cb70e4db1697c69f036c8bc0a253edc0baa6783154f1301e4",
"sha256:7598974f6879a338c785c513e7c5a4329fbc58b9f6b9a6305035fca5b1076552",
"sha256:7a279f33a081d436e90e91d1a7c338553c04e464de1c9302311a5e7e4b746088",
"sha256:95e1296e0157361fe2f5f0ed307fd31f94b0ca13372e3673fa95095a627636a1",
"sha256:9fc9da390e98cb6975eadf251b6e5fa088820141061bf041cd5c72deba1dc526",
"sha256:cc20316e3f5a6b582fc3b029d8dc03aabeb645acfcb7fc1d9848841a33265748",
"sha256:d1bf5a1a0d60c7f9a78e448adcb99aa101f3f9588b16708044638881be15d6bc",
"sha256:ed1d0760c7e46436ec90834d6f10477ff09475c692ed1695329d324b2c5cd547",
"sha256:ef9a55013676907df6c9d7dd943eb1770d014f68beaa7e73250fb43c759f4585"
],
"version": "==2.9"
},
"cycler": {
"hashes": [
"sha256:1d8a5ae1ff6c5cf9b93e8811e581232ad8920aeec647c37316ceac982b08cb2d",
"sha256:cd7b2d1018258d7247a71425e9f26463dfb444d411c39569972f4ce586b0c9d8"
],
"version": "==0.10.0"
},
"flask": {
"hashes": [
"sha256:4efa1ae2d7c9865af48986de8aeb8504bf32c7f3d6fdc9353d34b21f4b127060",
"sha256:8a4fdd8936eba2512e9c85df320a37e694c93945b33ef33c89946a340a238557"
],
"index": "pypi",
"version": "==1.1.2"
},
"flask-cors": {
"hashes": [
"sha256:72170423eb4612f0847318afff8c247b38bd516b7737adfc10d1c2cdbb382d16",
"sha256:f4d97201660e6bbcff2d89d082b5b6d31abee04b1b3003ee073a6fd25ad1d69a"
],
"index": "pypi",
"version": "==3.0.8"
},
"hiredis": {
"hashes": [
"sha256:01b577f84c20ecc9c07fc4c184231b08e3c3942de096fa99978e053de231c423",
"sha256:01ff0900134166961c9e339df77c33b72f7edc5cb41739f0babcd9faa345926e",
"sha256:03ed34a13316d0c34213c4fd46e0fa3a5299073f4d4f08e93fed8c2108b399b3",
"sha256:040436e91df5143aff9e0debb49530d0b17a6bd52200ce568621c31ef581b10d",
"sha256:091eb38fbf968d1c5b703e412bbbd25f43a7967d8400842cee33a5a07b33c27b",
"sha256:102f9b9dc6ed57feb3a7c9bdf7e71cb7c278fe8df1edfcfe896bc3e0c2be9447",
"sha256:2b4b392c7e3082860c8371fab3ae762139090f9115819e12d9f56060f9ede05d",
"sha256:2c9cc0b986397b833073f466e6b9e9c70d1d4dc2c2c1b3e9cae3a23102ff296c",
"sha256:2fa65a9df683bca72073cd77709ddeb289ea2b114d3775d225fbbcc5faf808c5",
"sha256:38437a681f17c975fd22349e72c29bc643f8e7eb2d6dc5df419eac59afa4d7ce",
"sha256:3b3428fa3cf1ee178807b52c9bee8950ab94cd4eaa9bfae8c1bbae3c49501d34",
"sha256:3dd8c2fae7f5494978facb0e93297dd627b1a3f536f3b070cf0a7d9157a07dcb",
"sha256:4414a96c212e732723b5c3d7c04d386ebbb2ec359e1de646322cbc3f875cbd0d",
"sha256:48c627581ad4ef60adbac980981407939acf13a0e18f093502c7b542223c4f19",
"sha256:4a60e71625a2d78d8ab84dfb2fa2cfd9458c964b6e6c04fea76d9ade153fb371",
"sha256:585ace09f434e43d8a8dbeb366865b1a044d7c06319b3c7372a0a00e63b860f4",
"sha256:74b364b3f06c9cf0a53f7df611045bc9437ed972a283fa1f0b12537236d23ddc",
"sha256:75c65c3850e89e9daa68d1b9bedd5806f177d60aa5a7b0953b4829481cfc1f72",
"sha256:7f052de8bf744730a9120dbdc67bfeb7605a01f69fb8e7ba5c475af33c24e145",
"sha256:8113a7d5e87ecf57cd4ae263cc9e429adb9a3e59f5a7768da5d3312a8d0a051a",
"sha256:84857ce239eb8ed191ac78e77ff65d52902f00f30f4ee83bf80eb71da73b70e6",
"sha256:8644a48ddc4a40b3e3a6b9443f396c2ee353afb2d45656c4fc68d04a82e8e3f7",
"sha256:936aa565e673536e8a211e43ec43197406f24cd1f290138bd143765079c8ba00",
"sha256:9afeb88c67bbc663b9f27385c496da056d06ad87f55df6e393e1516cfecb0461",
"sha256:9d62cc7880110e4f83b0a51d218f465d3095e2751fbddd34e553dbd106a929ff",
"sha256:a1fadd062fc8d647ff39220c57ea2b48c99bb73f18223828ec97f88fc27e7898",
"sha256:a7754a783b1e5d6f627c19d099b178059c62f782ab62b4d8ba165b9fbc2ee34c",
"sha256:aa59dd63bb3f736de4fc2d080114429d5d369dfb3265f771778e8349d67a97a4",
"sha256:ae2ee0992f8de249715435942137843a93db204dd7db1e7cc9bdc5a8436443e8",
"sha256:b36842d7cf32929d568f37ec5b3173b72b2ec6572dec4d6be6ce774762215aee",
"sha256:bcbf9379c553b5facc6c04c1e5569b44b38ff16bcbf354676287698d61ee0c92",
"sha256:cbccbda6f1c62ab460449d9c85fdf24d0d32a6bf45176581151e53cc26a5d910",
"sha256:d0caf98dfb8af395d6732bd16561c0a2458851bea522e39f12f04802dbf6f502",
"sha256:d6456afeddba036def1a36d8a2758eca53202308d83db20ab5d0b66590919627",
"sha256:dbaef9a21a4f10bc281684ee4124f169e62bb533c2a92b55f8c06f64f9af7b8f",
"sha256:dce84916c09aaece006272b37234ae84a8ed13abb3a4d341a23933b8701abfb5",
"sha256:eb8c9c8b9869539d58d60ff4a28373a22514d40495911451343971cb4835b7a9",
"sha256:efc98b14ee3a8595e40b1425e8d42f5fd26f11a7b215a81ef9259068931754f4",
"sha256:fa2dc05b87d97acc1c6ae63f3e0f39eae5246565232484b08db6bf2dc1580678",
"sha256:fe7d6ce9f6a5fbe24f09d95ea93e9c7271abc4e1565da511e1449b107b4d7848"
],
"version": "==1.0.1"
},
"idna": {
"hashes": [
"sha256:7588d1c14ae4c77d74036e8c22ff447b26d0fde8f007354fd48a7814db15b7cb",
"sha256:a068a21ceac8a4d63dbfd964670474107f541babbd2250d61922f029858365fa"
],
"version": "==2.9"
},
"itsdangerous": {
"hashes": [
"sha256:321b033d07f2a4136d3ec762eac9f16a10ccd60f53c0c91af90217ace7ba1f19",
"sha256:b12271b2047cb23eeb98c8b5622e2e5c5e9abd9784a153e9d8ef9cb4dd09d749"
],
"version": "==1.1.0"
},
"jinja2": {
"hashes": [
"sha256:93187ffbc7808079673ef52771baa950426fd664d3aad1d0fa3e95644360e250",
"sha256:b0eaf100007721b5c16c1fc1eecb87409464edc10469ddc9a22a27a99123be49"
],
"version": "==2.11.1"
},
"kiwisolver": {
"hashes": [
"sha256:03662cbd3e6729f341a97dd2690b271e51a67a68322affab12a5b011344b973c",
"sha256:18d749f3e56c0480dccd1714230da0f328e6e4accf188dd4e6884bdd06bf02dd",
"sha256:247800260cd38160c362d211dcaf4ed0f7816afb5efe56544748b21d6ad6d17f",
"sha256:443c2320520eda0a5b930b2725b26f6175ca4453c61f739fef7a5847bd262f74",
"sha256:4eadb361baf3069f278b055e3bb53fa189cea2fd02cb2c353b7a99ebb4477ef1",
"sha256:556da0a5f60f6486ec4969abbc1dd83cf9b5c2deadc8288508e55c0f5f87d29c",
"sha256:603162139684ee56bcd57acc74035fceed7dd8d732f38c0959c8bd157f913fec",
"sha256:60a78858580761fe611d22127868f3dc9f98871e6fdf0a15cc4203ed9ba6179b",
"sha256:7cc095a4661bdd8a5742aaf7c10ea9fac142d76ff1770a0f84394038126d8fc7",
"sha256:c31bc3c8e903d60a1ea31a754c72559398d91b5929fcb329b1c3a3d3f6e72113",
"sha256:c955791d80e464da3b471ab41eb65cf5a40c15ce9b001fdc5bbc241170de58ec",
"sha256:d069ef4b20b1e6b19f790d00097a5d5d2c50871b66d10075dab78938dc2ee2cf",
"sha256:d52b989dc23cdaa92582ceb4af8d5bcc94d74b2c3e64cd6785558ec6a879793e",
"sha256:e586b28354d7b6584d8973656a7954b1c69c93f708c0c07b77884f91640b7657",
"sha256:efcf3397ae1e3c3a4a0a0636542bcad5adad3b1dd3e8e629d0b6e201347176c8",
"sha256:fccefc0d36a38c57b7bd233a9b485e2f1eb71903ca7ad7adacad6c28a56d62d2"
],
"version": "==1.2.0"
},
"markupsafe": {
"hashes": [
"sha256:00bc623926325b26bb9605ae9eae8a215691f33cae5df11ca5424f06f2d1f473",
"sha256:09027a7803a62ca78792ad89403b1b7a73a01c8cb65909cd876f7fcebd79b161",
"sha256:09c4b7f37d6c648cb13f9230d847adf22f8171b1ccc4d5682398e77f40309235",
"sha256:1027c282dad077d0bae18be6794e6b6b8c91d58ed8a8d89a89d59693b9131db5",
"sha256:13d3144e1e340870b25e7b10b98d779608c02016d5184cfb9927a9f10c689f42",
"sha256:24982cc2533820871eba85ba648cd53d8623687ff11cbb805be4ff7b4c971aff",
"sha256:29872e92839765e546828bb7754a68c418d927cd064fd4708fab9fe9c8bb116b",
"sha256:43a55c2930bbc139570ac2452adf3d70cdbb3cfe5912c71cdce1c2c6bbd9c5d1",
"sha256:46c99d2de99945ec5cb54f23c8cd5689f6d7177305ebff350a58ce5f8de1669e",
"sha256:500d4957e52ddc3351cabf489e79c91c17f6e0899158447047588650b5e69183",
"sha256:535f6fc4d397c1563d08b88e485c3496cf5784e927af890fb3c3aac7f933ec66",
"sha256:596510de112c685489095da617b5bcbbac7dd6384aeebeda4df6025d0256a81b",
"sha256:62fe6c95e3ec8a7fad637b7f3d372c15ec1caa01ab47926cfdf7a75b40e0eac1",
"sha256:6788b695d50a51edb699cb55e35487e430fa21f1ed838122d722e0ff0ac5ba15",
"sha256:6dd73240d2af64df90aa7c4e7481e23825ea70af4b4922f8ede5b9e35f78a3b1",
"sha256:717ba8fe3ae9cc0006d7c451f0bb265ee07739daf76355d06366154ee68d221e",
"sha256:79855e1c5b8da654cf486b830bd42c06e8780cea587384cf6545b7d9ac013a0b",
"sha256:7c1699dfe0cf8ff607dbdcc1e9b9af1755371f92a68f706051cc8c37d447c905",
"sha256:88e5fcfb52ee7b911e8bb6d6aa2fd21fbecc674eadd44118a9cc3863f938e735",
"sha256:8defac2f2ccd6805ebf65f5eeb132adcf2ab57aa11fdf4c0dd5169a004710e7d",
"sha256:98c7086708b163d425c67c7a91bad6e466bb99d797aa64f965e9d25c12111a5e",
"sha256:9add70b36c5666a2ed02b43b335fe19002ee5235efd4b8a89bfcf9005bebac0d",
"sha256:9bf40443012702a1d2070043cb6291650a0841ece432556f784f004937f0f32c",
"sha256:ade5e387d2ad0d7ebf59146cc00c8044acbd863725f887353a10df825fc8ae21",
"sha256:b00c1de48212e4cc9603895652c5c410df699856a2853135b3967591e4beebc2",
"sha256:b1282f8c00509d99fef04d8ba936b156d419be841854fe901d8ae224c59f0be5",
"sha256:b2051432115498d3562c084a49bba65d97cf251f5a331c64a12ee7e04dacc51b",
"sha256:ba59edeaa2fc6114428f1637ffff42da1e311e29382d81b339c1817d37ec93c6",
"sha256:c8716a48d94b06bb3b2524c2b77e055fb313aeb4ea620c8dd03a105574ba704f",
"sha256:cd5df75523866410809ca100dc9681e301e3c27567cf498077e8551b6d20e42f",
"sha256:cdb132fc825c38e1aeec2c8aa9338310d29d337bebbd7baa06889d09a60a1fa2",
"sha256:e249096428b3ae81b08327a63a485ad0878de3fb939049038579ac0ef61e17e7",
"sha256:e8313f01ba26fbbe36c7be1966a7b7424942f670f38e666995b88d012765b9be"
],
"version": "==1.1.1"
},
"matplotlib": {
"hashes": [
"sha256:2466d4dddeb0f5666fd1e6736cc5287a4f9f7ae6c1a9e0779deff798b28e1d35",
"sha256:282b3fc8023c4365bad924d1bb442ddc565c2d1635f210b700722776da466ca3",
"sha256:4bb50ee4755271a2017b070984bcb788d483a8ce3132fab68393d1555b62d4ba",
"sha256:56d3147714da5c7ac4bc452d041e70e0e0b07c763f604110bd4e2527f320b86d",
"sha256:7a9baefad265907c6f0b037c8c35a10cf437f7708c27415a5513cf09ac6d6ddd",
"sha256:aae7d107dc37b4bb72dcc45f70394e6df2e5e92ac4079761aacd0e2ad1d3b1f7",
"sha256:af14e77829c5b5d5be11858d042d6f2459878f8e296228c7ea13ec1fd308eb68",
"sha256:c1cf735970b7cd424502719b44288b21089863aaaab099f55e0283a721aaf781",
"sha256:ce378047902b7a05546b6485b14df77b2ff207a0054e60c10b5680132090c8ee",
"sha256:d35891a86a4388b6965c2d527b9a9f9e657d9e110b0575ca8a24ba0d4e34b8fc",
"sha256:e06304686209331f99640642dee08781a9d55c6e32abb45ed54f021f46ccae47",
"sha256:e20ba7fb37d4647ac38f3c6d8672dd8b62451ee16173a0711b37ba0ce42bf37d",
"sha256:f4412241e32d0f8d3713b68d3ca6430190a5e8a7c070f1c07d7833d8c5264398",
"sha256:ffe2f9cdcea1086fc414e82f42271ecf1976700b8edd16ca9d376189c6d93aee"
],
"index": "pypi",
"version": "==3.2.1"
},
"multiprocessing-logging": {
"hashes": [
"sha256:9d3eb0f1f859b7ba6250a029726f77a7701999deda939595122d8748751de2e3"
],
"index": "pypi",
"version": "==0.3.1"
},
"munch": {
"hashes": [
"sha256:2d735f6f24d4dba3417fa448cae40c6e896ec1fdab6cdb5e6510999758a4dbd2",
"sha256:6f44af89a2ce4ed04ff8de41f70b226b984db10a91dcc7b9ac2efc1c77022fdd"
],
"index": "pypi",
"version": "==2.5.0"
},
"numpy": {
"hashes": [
"sha256:1598a6de323508cfeed6b7cd6c4efb43324f4692e20d1f76e1feec7f59013448",
"sha256:1b0ece94018ae21163d1f651b527156e1f03943b986188dd81bc7e066eae9d1c",
"sha256:2e40be731ad618cb4974d5ba60d373cdf4f1b8dcbf1dcf4d9dff5e212baf69c5",
"sha256:4ba59db1fcc27ea31368af524dcf874d9277f21fd2e1f7f1e2e0c75ee61419ed",
"sha256:59ca9c6592da581a03d42cc4e270732552243dc45e87248aa8d636d53812f6a5",
"sha256:5e0feb76849ca3e83dd396254e47c7dba65b3fa9ed3df67c2556293ae3e16de3",
"sha256:6d205249a0293e62bbb3898c4c2e1ff8a22f98375a34775a259a0523111a8f6c",
"sha256:6fcc5a3990e269f86d388f165a089259893851437b904f422d301cdce4ff25c8",
"sha256:82847f2765835c8e5308f136bc34018d09b49037ec23ecc42b246424c767056b",
"sha256:87902e5c03355335fc5992a74ba0247a70d937f326d852fc613b7f53516c0963",
"sha256:9ab21d1cb156a620d3999dd92f7d1c86824c622873841d6b080ca5495fa10fef",
"sha256:a1baa1dc8ecd88fb2d2a651671a84b9938461e8a8eed13e2f0a812a94084d1fa",
"sha256:a244f7af80dacf21054386539699ce29bcc64796ed9850c99a34b41305630286",
"sha256:a35af656a7ba1d3decdd4fae5322b87277de8ac98b7d9da657d9e212ece76a61",
"sha256:b1fe1a6f3a6f355f6c29789b5927f8bd4f134a4bd9a781099a7c4f66af8850f5",
"sha256:b5ad0adb51b2dee7d0ee75a69e9871e2ddfb061c73ea8bc439376298141f77f5",
"sha256:ba3c7a2814ec8a176bb71f91478293d633c08582119e713a0c5351c0f77698da",
"sha256:cd77d58fb2acf57c1d1ee2835567cd70e6f1835e32090538f17f8a3a99e5e34b",
"sha256:cdb3a70285e8220875e4d2bc394e49b4988bdb1298ffa4e0bd81b2f613be397c",
"sha256:deb529c40c3f1e38d53d5ae6cd077c21f1d49e13afc7936f7f868455e16b64a0",
"sha256:e7894793e6e8540dbeac77c87b489e331947813511108ae097f1715c018b8f3d"
],
"version": "==1.18.2"
},
"overrides": {
"hashes": [
"sha256:2ee4055a686a3ab30621deca01e43562e97825e29b7993e66d73f287d204e868"
],
"index": "pypi",
"version": "==2.8.0"
},
"pebble": {
"hashes": [
"sha256:077b51bcb8726ad9003214fb268fe8c51778503f4c37ffcea9a905ab97e23473",
"sha256:26fdcc0f36d93d8e07559d36b942b7800c6b9622626d5b587ab1a74820d02732"
],
"index": "pypi",
"version": "==4.5.1"
},
"pycparser": {
"hashes": [
"sha256:2d475327684562c3a96cc71adf7dc8c4f0565175cf86b6d7a404ff4c771f15f0",
"sha256:7582ad22678f0fcd81102833f60ef8d0e57288b6b5fb00323d101be910e35705"
],
"version": "==2.20"
},
"pydash": {
"hashes": [
"sha256:a7733886ab811e36510b44ff1de7ccc980327d701fb444a4b2ce395e6f4a4a87",
"sha256:bc9762159c3fd1f822b131a2d9cbb2b2036595a42ad257d2d821b29803d85f7d"
],
"index": "pypi",
"version": "==4.7.6"
},
"pyjwt": {
"hashes": [
"sha256:5c6eca3c2940464d106b99ba83b00c6add741c9becaec087fb7ccdefea71350e",
"sha256:8d59a976fb773f3e6a39c85636357c4f0e242707394cadadd9814f5cbaa20e96"
],
"index": "pypi",
"version": "==1.7.1"
},
"pyparsing": {
"hashes": [
"sha256:c203ec8783bf771a155b207279b9bccb8dea02d8f0c9e5f8ead507bc3246ecc1",
"sha256:ef9d7589ef3c200abe66653d3f1ab1033c3c419ae9b9bdb1240a85b024efc88b"
],
"version": "==2.4.7"
},
"python-dateutil": {
"hashes": [
"sha256:73ebfe9dbf22e832286dafa60473e4cd239f8592f699aa5adaf10050e6e1823c",
"sha256:75bb3f31ea686f1197762692a9ee6a7550b59fc6ca3a1f4b5d7e32fb98e2da2a"
],
"version": "==2.8.1"
},
"python-json-logger": {
"hashes": [
"sha256:b7a31162f2a01965a5efb94453ce69230ed208468b0bbc7fdfc56e6d8df2e281"
],
"index": "pypi",
"version": "==0.1.11"
},
"redis": {
"hashes": [
"sha256:0dcfb335921b88a850d461dc255ff4708294943322bd55de6cfd68972490ca1f",
"sha256:b205cffd05ebfd0a468db74f0eedbff8df1a7bfc47521516ade4692991bb0833"
],
"index": "pypi",
"version": "==3.4.1"
},
"requests": {
"hashes": [
"sha256:43999036bfa82904b6af1d99e4882b560e5e2c68e5c4b0aa03b655f3d7d73fee",
"sha256:b3f43d496c6daba4493e7c431722aeb7dbc6288f52a6e04e7b6023b0247817e6"
],
"index": "pypi",
"version": "==2.23.0"
},
"scipy": {
"hashes": [
"sha256:00af72998a46c25bdb5824d2b729e7dabec0c765f9deb0b504f928591f5ff9d4",
"sha256:0902a620a381f101e184a958459b36d3ee50f5effd186db76e131cbefcbb96f7",
"sha256:1e3190466d669d658233e8a583b854f6386dd62d655539b77b3fa25bfb2abb70",
"sha256:2cce3f9847a1a51019e8c5b47620da93950e58ebc611f13e0d11f4980ca5fecb",
"sha256:3092857f36b690a321a662fe5496cb816a7f4eecd875e1d36793d92d3f884073",
"sha256:386086e2972ed2db17cebf88610aab7d7f6e2c0ca30042dc9a89cf18dcc363fa",
"sha256:71eb180f22c49066f25d6df16f8709f215723317cc951d99e54dc88020ea57be",
"sha256:770254a280d741dd3436919d47e35712fb081a6ff8bafc0f319382b954b77802",
"sha256:787cc50cab3020a865640aba3485e9fbd161d4d3b0d03a967df1a2881320512d",
"sha256:8a07760d5c7f3a92e440ad3aedcc98891e915ce857664282ae3c0220f3301eb6",
"sha256:8d3bc3993b8e4be7eade6dcc6fd59a412d96d3a33fa42b0fa45dc9e24495ede9",
"sha256:9508a7c628a165c2c835f2497837bf6ac80eb25291055f56c129df3c943cbaf8",
"sha256:a144811318853a23d32a07bc7fd5561ff0cac5da643d96ed94a4ffe967d89672",
"sha256:a1aae70d52d0b074d8121333bc807a485f9f1e6a69742010b33780df2e60cfe0",
"sha256:a2d6df9eb074af7f08866598e4ef068a2b310d98f87dc23bd1b90ec7bdcec802",
"sha256:bb517872058a1f087c4528e7429b4a44533a902644987e7b2fe35ecc223bc408",
"sha256:c5cac0c0387272ee0e789e94a570ac51deb01c796b37fb2aad1fb13f85e2f97d",
"sha256:cc971a82ea1170e677443108703a2ec9ff0f70752258d0e9f5433d00dda01f59",
"sha256:dba8306f6da99e37ea08c08fef6e274b5bf8567bb094d1dbe86a20e532aca088",
"sha256:dc60bb302f48acf6da8ca4444cfa17d52c63c5415302a9ee77b3b21618090521",
"sha256:dee1bbf3a6c8f73b6b218cb28eed8dd13347ea2f87d572ce19b289d6fd3fbc59"
],
"index": "pypi",
"version": "==1.4.1"
},
"six": {
"hashes": [
"sha256:236bdbdce46e6e6a3d61a337c0f8b763ca1e8717c03b369e87a7ec7ce1319c0a",
"sha256:8f3cd2e254d8f793e7f3d6d9df77b92252b52637291d0f0da013c76ea2724b6c"
],
"index": "pypi",
"version": "==1.14.0"
},
"tabulate": {
"hashes": [
"sha256:ac64cb76d53b1231d364babcd72abbb16855adac7de6665122f97b593f1eb2ba",
"sha256:db2723a20d04bcda8522165c73eea7c300eda74e0ce852d9022e0159d7895007"
],
"index": "pypi",
"version": "==0.8.7"
},
"urllib3": {
"hashes": [
"sha256:2f3db8b19923a873b3e5256dc9c2dedfa883e33d87c690d9c7913e1f40673cdc",
"sha256:87716c2d2a7121198ebcb7ce7cccf6ce5e9ba539041cfbaeecfb641dc0bf6acc"
],
"version": "==1.25.8"
},
"werkzeug": {
"hashes": [
"sha256:2de2a5db0baeae7b2d2664949077c2ac63fbd16d98da0ff71837f7d1dea3fd43",
"sha256:6c80b1e5ad3665290ea39320b91e1be1e0d5f60652b964a3070216de83d2e47c"
],
"version": "==1.0.1"
}
},
"develop": {}
}

43
backend/README.md Normal file

@@ -0,0 +1,43 @@
&copy; 2019 The Johns Hopkins University Applied Physics Laboratory LLC.
██████╗ ██╗███╗ ██╗███████╗
██╔══██╗██║████╗ ██║██╔════╝
██████╔╝██║██╔██╗ ██║█████╗
██╔═══╝ ██║██║╚██╗██║██╔══╝
██║ ██║██║ ╚████║███████╗
╚═╝ ╚═╝╚═╝ ╚═══╝╚══════╝
Pmap Interface for Nlp Experimentation
Web App Backend
## Development Environment
Developer pre-reqs:
* Python 3 with pip and pipenv
First-time setup:
* `pipenv install --dev`
- This will create a virtualenv and install the necessary packages.
Running the server:
* `./dev_run.sh`
Once test data has been set up in the eve layer, the script `setup_dev_data.sh`
can be used to set up data from the backend's perspective.
## Setup
Before running, you must edit `../.env` and set `VEGAS_CLIENT_SECRET` appropriately
if you are using the "vegas" auth module. Alternatively set this secret as an
environment variable.
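For example (a minimal sketch; the secret value comes from your VEGAS/OAuth registration):
```bash
# set the secret for this shell session instead of editing ../.env
export VEGAS_CLIENT_SECRET="<client secret>"
./dev_run.sh
```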
## Authentication:
* The "vegas" module is used by default.
* An "eve" module is also provided. This queries the eve server for users and uses those
for authentication. You can run `scripts/data/list_users.sh` to list available users in the
data server. No further configuration is needed locally.
## Production Environment:
This service can also be run as a Docker container.
It should be run using docker-compose at the top level (../).

22
backend/dev_run.sh Executable file

@@ -0,0 +1,22 @@
#!/bin/bash
# (C) 2019 The Johns Hopkins University Applied Physics Laboratory LLC.
DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )"
CONFIG_FILE="${DIR}/pine/backend/config.py"
if [[ -z ${VEGAS_CLIENT_SECRET} ]]; then
echo ""
echo ""
echo ""
echo "~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~"
echo "Please set VEGAS_CLIENT_SECRET environment variable"
echo "~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~"
echo ""
echo ""
echo ""
exit 1
fi
export FLASK_APP="pine.backend"
export FLASK_ENV="development"
pipenv run flask run


@@ -0,0 +1,13 @@
import logging.config
import os
import json
bind = "0.0.0.0:${PORT}"
workers = ${WORKERS}
accesslog = "-"
if "PINE_LOGGING_CONFIG_FILE" in os.environ and os.path.isfile(os.environ["PINE_LOGGING_CONFIG_FILE"]):
with open(os.environ["PINE_LOGGING_CONFIG_FILE"], "r") as f:
c = json.load(f)
c["disable_existing_loggers"] = True
logconfig_dict = c

5
backend/docker/wsgi.py Normal file

@@ -0,0 +1,5 @@
# (C) 2019 The Johns Hopkins University Applied Physics Laboratory LLC.
from pine.backend import create_app
app = create_app()

21
backend/docker_run.sh Executable file

@@ -0,0 +1,21 @@
#!/bin/bash
# (C) 2019 The Johns Hopkins University Applied Physics Laboratory LLC.
GUNICORN_CONFIG_FILE="config.py"
if [[ -z ${VEGAS_CLIENT_SECRET} ]]; then
echo ""
echo ""
echo ""
echo "~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~"
echo "Please set VEGAS_CLIENT_SECRET environment variable"
echo "~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~"
echo ""
echo ""
echo ""
exit 1
fi
set -e
/usr/local/bin/gunicorn --config ${GUNICORN_CONFIG_FILE} wsgi:app

1
backend/pine/__init__.py Normal file

@@ -0,0 +1 @@
# (C) 2019 The Johns Hopkins University Applied Physics Laboratory LLC.


@@ -0,0 +1,3 @@
# (C) 2019 The Johns Hopkins University Applied Physics Laboratory LLC.
from .app import create_app


@@ -0,0 +1,4 @@
# (C) 2019 The Johns Hopkins University Applied Physics Laboratory LLC.
"""This module implements all methods required for user authentication when using PINE's db authentication rather than
an external Auth service"""


@@ -0,0 +1,150 @@
# (C) 2019 The Johns Hopkins University Applied Physics Laboratory LLC.
from flask import Blueprint, jsonify, request, Response
import requests
from werkzeug import exceptions
from .. import auth
from ..auth import password
from ..data import service, users
bp = Blueprint("admin", __name__, url_prefix = "/admin")
@bp.route("/users", methods = ["GET"])
@auth.admin_required
def get_users():
"""
Get the list of all users' details (id, email, and password hash)
:return: str
"""
return jsonify(users.get_all_users())
@bp.route("/users/<user_id>", methods = ["GET"])
@auth.admin_required
def get_user(user_id):
"""
Given a user_id, return the user's details (id, email, and password hash)
:param user_id: str
:return: str
"""
return jsonify(users.get_user(user_id))
@bp.route("/users/<user_id>/password", methods = ["POST", "PUT", "PATCH"])
@auth.admin_required
def update_user_password(user_id):
"""
Change the password hash stored in the database for the given user to a newly calculated password hash derived from
the password provided in the json body of this request.
:param user_id:
:return: Response
"""
if not request.is_json:
raise exceptions.BadRequest()
body = request.get_json()
user = users.get_user(user_id)
etag = user["_etag"]
service.remove_nonupdatable_fields(user)
user["passwdhash"] = password.hash_password(body["passwd"])
resp = service.put("users/" + user_id, json = user, headers = {"If-Match": etag})
return service.convert_response(resp)
@bp.route("/users/<user_id>", methods = ["PUT", "PATCH"])
@auth.admin_required
def update_user(user_id):
"""
Change the details stored in the database for the given user to those provided in the json body of this request.
:param user_id: str
:return: Response
"""
if not request.is_json:
raise exceptions.BadRequest()
body = request.get_json()
etag = body["_etag"]
service.remove_nonupdatable_fields(body)
resp = service.put("users/" + user_id, json = body, headers = {"If-Match": etag})
return service.convert_response(resp)
@bp.route("/users", methods = ["POST"])
@auth.admin_required
def add_user():
"""
Add a new user to PINE, with the details provided in the json body of this request (id, email, and password hash).
This method will calculate and store a password hash based upon the provided password
:return: Response
"""
if not request.is_json:
raise exceptions.BadRequest()
body = request.get_json()
# first check that the id and email are not already in use
if not "id" in body or not body["id"]:
raise exceptions.BadRequest(description = "Missing id in body JSON data.")
try:
user = users.get_user(body["id"])
if user != None:
raise exceptions.Conflict(description = "User with id {} already exists.".format(body["username"]))
except exceptions.NotFound: pass
if not "email" in body or not body["email"]:
raise exceptions.BadRequest(description = "Missing email in body JSON data.")
user = users.get_user_by_email(body["email"])
if user != None:
raise exceptions.Conflict(description = "User with email {} already exists.".format(body["email"]))
# replace the password with a hash
if not "passwd" in body or not body["passwd"]:
raise exceptions.BadRequest(description = "Missing passwd in body JSON data.")
body["passwdhash"] = password.hash_password(body["passwd"])
del body["passwd"]
body["_id"] = body["id"]
del body["id"]
if body["description"] == None:
del body["description"]
# post to data server
resp = service.post("users", json = body)
return service.convert_response(resp)
@bp.route("/users/<user_id>", methods = ["DELETE"])
@auth.admin_required
def delete_user(user_id):
"""
Delete the user matching the given user_id
:param user_id: str
:return: Response
"""
# make sure you're not deleting the logged in user
if user_id == auth.get_logged_in_user()["id"]:
return jsonify({"success": False, "error": "Cannot delete currently logged in user."}), exceptions.Conflict.code
# TODO if that user is logged in, force log them out
# delete user
user = users.get_user(user_id)
headers = {"If-Match": user["_etag"]}
return service.convert_response(service.delete("users/" + user_id, headers = headers))
@bp.route("/system/export", methods = ["GET"])
@auth.admin_required
def system_export():
"""
Export the contents of the database as a zip file
:return: Response
"""
resp = service.convert_response(service.get("system/export"))
resp.headers["Access-Control-Expose-Headers"] = "Content-Disposition"
return resp
@bp.route("/system/import", methods = ["PUT", "POST"])
@auth.admin_required
def system_import():
"""
Import the contents of the data provided in the request body to the database
:return: Response
"""
return service.convert_response(
requests.request(request.method, service.url("system", "import"), data = request.get_data(),
headers = request.headers))
def init_app(app):
app.register_blueprint(bp)


@@ -0,0 +1,3 @@
# (C) 2019 The Johns Hopkins University Applied Physics Laboratory LLC.
"""This module contains the api methods required to perform and display annotations in the front-end and store the
annotations in the backend"""


@@ -0,0 +1,306 @@
# (C) 2019 The Johns Hopkins University Applied Physics Laboratory LLC.
import json
import logging
from flask import abort, Blueprint, jsonify, request
from werkzeug import exceptions
from .. import auth, log
from ..data import service
from ..documents import bp as documents
"""This module contains the api methods required to perform and display annotations in the front-end and store the
annotations in the backend"""
logger = logging.getLogger(__name__)
CONFIG_ALLOW_OVERLAPPING_NER_ANNOTATIONS = "allow_overlapping_ner_annotations"
bp = Blueprint("annotations", __name__, url_prefix = "/annotations")
def check_document(doc_id):
"""
Verify that a document with the given doc_id exists and that the logged in user has permissions to access the
document
:param doc_id: str
:return: None
"""
if not documents.user_can_view_by_id(doc_id):
raise exceptions.Unauthorized()
@bp.route("/mine/by_document_id/<doc_id>")
@auth.login_required
def get_my_annotations_for_document(doc_id):
"""
Get the list of annotations (key, start_index, end_index) produced by the logged in user for the document matching
the provided doc_id.
:param doc_id: str
:return: Response
"""
check_document(doc_id)
where = {
"document_id": doc_id,
"creator_id": auth.get_logged_in_user()["id"]
}
resp = service.get("annotations", params = service.where_params(where))
if not resp.ok:
abort(resp.status_code)
return service.convert_response(resp)
@bp.route("/others/by_document_id/<doc_id>")
@auth.login_required
def get_others_annotations_for_document(doc_id):
"""
Get the list of annotations (key, start_index, end_index) produced by all other users, not including the logged in
user for the document matching the provided doc_id.
:param doc_id: str
:return: str
"""
check_document(doc_id)
where = {
"document_id": doc_id,
# $eq doesn't work here for some reason -- maybe because objectid?
"creator_id": { "$not": { "$in": [auth.get_logged_in_user()["id"]] } }
}
resp = service.get("annotations", params = service.where_params(where))
if not resp.ok:
abort(resp.status_code)
return service.convert_response(resp)
@bp.route("/by_document_id/<doc_id>")
@auth.login_required
def get_annotations_for_document(doc_id):
"""
Get the list of annotations (key, start_index, end_index) produced by all users for the document matching the
provided doc_id.
:param doc_id: str
:return: str
"""
check_document(doc_id)
where = {
"document_id": doc_id
}
resp = service.get("annotations", params = service.where_params(where))
if not resp.ok:
abort(resp.status_code)
return service.convert_response(resp)
def get_current_annotation(doc_id, user_id):
"""
Get all annotations of the provided document created by the given user.
:param doc_id: str
:param user_id: str
:return: List
"""
where = {
"document_id": doc_id,
"creator_id": user_id
}
annotations = service.get_items("/annotations", service.where_params(where))
if len(annotations) > 0:
return annotations[0]
else:
return None
def is_ner_annotation(ann):
"""
Verify that the provided annotation is in the valid format for an NER Annotation
:param ann: Any
:return: Bool
"""
return (type(ann) is list or type(ann) is tuple) and len(ann) == 3
def check_overlapping_annotations(document, ner_annotations):
ner_annotations.sort(key = lambda x: x[0])
resp = service.get("collections/" + document["collection_id"])
if not resp.ok:
abort(resp.status_code)
collection = resp.json()
# if allow_overlapping_ner_annotations is false, check them
if "configuration" in collection and CONFIG_ALLOW_OVERLAPPING_NER_ANNOTATIONS in collection["configuration"] and not collection["configuration"][CONFIG_ALLOW_OVERLAPPING_NER_ANNOTATIONS]:
for idx, val in enumerate(ner_annotations):
if idx == 0: continue
prev = ner_annotations[idx - 1]
if val[0] < prev[1]:
raise exceptions.BadRequest("Collection is configured not to allow overlapping annotations")
@bp.route("/mine/by_document_id/<doc_id>/ner", methods = ["POST", "PUT"])
@auth.login_required
def save_ner_annotations(doc_id):
"""
Save new NER annotations to the database as an entry for the logged in user, for the document. If there are already
annotations, use a patch request to update with the new annotations. If there are not, use a post request to create
a new entry.
:param doc_id: str
:return: str
"""
if not request.is_json:
raise exceptions.BadRequest()
check_document(doc_id)
document = service.get_item_by_id("documents", doc_id, {
"projection": json.dumps({
"collection_id": 1,
"metadata": 1
})
})
annotations = request.get_json()
user_id = auth.get_logged_in_user()["id"]
annotations = [(ann["start"], ann["end"], ann["label"]) for ann in annotations]
check_overlapping_annotations(document, annotations)
new_annotation = {
"creator_id": user_id,
"collection_id": document["collection_id"],
"document_id": doc_id,
"annotation": annotations
}
current_annotation = get_current_annotation(doc_id, user_id)
if current_annotation != None:
if current_annotation["annotation"] == annotations:
return jsonify(True)
headers = {"If-Match": current_annotation["_etag"]}
# add all the other non-ner labels
for annotation in current_annotation["annotation"]:
if not is_ner_annotation(annotation):
new_annotation["annotation"].append(annotation)
resp = service.patch(["annotations", current_annotation["_id"]], json = new_annotation, headers = headers)
else:
resp = service.post("annotations", json = new_annotation)
if resp.ok:
new_annotation["_id"] = resp.json()["_id"]
log.access_flask_annotate_document(document, new_annotation)
return jsonify(resp.ok)
def is_doc_annotation(ann):
"""
Verify that an annotation has the correct format (string)
:param ann: Any
:return: Bool
"""
return isinstance(ann, str)
@bp.route("/mine/by_document_id/<doc_id>/doc", methods = ["POST", "PUT"])
@auth.login_required
def save_doc_labels(doc_id):
"""
Save new labels to the database as an entry for the logged in user, for the document. If there are already
annotations/labels, use a patch request to update with the new labels. If there are not, use a post request to
create a new entry.
:param doc_id:
:return:
"""
if not request.is_json:
raise exceptions.BadRequest()
check_document(doc_id)
document = service.get_item_by_id("documents", doc_id, {
"projection": json.dumps({
"collection_id": 1,
"metadata": 1
})
})
labels = request.get_json()
user_id = auth.get_logged_in_user()["id"]
new_annotation = {
"creator_id": user_id,
"collection_id": document["collection_id"],
"document_id": doc_id,
"annotation": labels
}
current_annotation = get_current_annotation(doc_id, user_id)
if current_annotation != None:
if current_annotation["annotation"] == labels:
return jsonify(True)
headers = {"If-Match": current_annotation["_etag"]}
# add all the other non-doc labels
for annotation in current_annotation["annotation"]:
if not is_doc_annotation(annotation):
new_annotation["annotation"].append(annotation)
resp = service.patch(["annotations", current_annotation["_id"]], json = new_annotation, headers = headers)
else:
resp = service.post("annotations", json = new_annotation)
if resp.ok:
new_annotation["_id"] = resp.json()["_id"]
log.access_flask_annotate_document(document, new_annotation)
return jsonify(resp.ok)
def set_document_to_annotated_by_user(doc_id, user_id):
"""
Modify the parameter in the database for the document signifying that the given user has annotated the given
document
:param doc_id: str
:param user_id: str
:return: Response | None
"""
document = service.get_item_by_id("/documents", doc_id)
new_document = {
"has_annotated": document["has_annotated"]
}
new_document["has_annotated"][user_id] = True
headers = {"If-Match": document["_etag"]}
return service.patch(["documents", doc_id], json=new_document, headers=headers).ok
@bp.route("/mine/by_document_id/<doc_id>", methods = ["POST", "PUT"])
def save_annotations(doc_id):
"""
Save new NER annotations and labels to the database as an entry for the logged in user, for the document. If there
are already annotations, use a patch request to update with the new annotations. If there are not, use a post
request to create a new entry.
:param doc_id: str
:return: str
"""
if not request.is_json:
raise exceptions.BadRequest()
check_document(doc_id)
document = service.get_item_by_id("documents", doc_id, {
"projection": json.dumps({
"collection_id": 1,
"metadata": 1
})
})
body = request.get_json()
if "doc" not in body or "ner" not in body:
raise exceptions.BadRequest()
labels = body["doc"]
annotations = [(ann["start"], ann["end"], ann["label"]) for ann in body["ner"]]
check_overlapping_annotations(document, annotations)
user_id = auth.get_logged_in_user()["id"]
new_annotation = {
"creator_id": user_id,
"collection_id": document["collection_id"],
"document_id": doc_id,
"annotation": labels + annotations
}
current_annotation = get_current_annotation(doc_id, user_id)
if current_annotation is not None:
if current_annotation["annotation"] == new_annotation["annotation"]:
return jsonify(True)
headers = {"If-Match": current_annotation["_etag"]}
resp = service.patch(["annotations", current_annotation["_id"]], json = new_annotation, headers = headers)
else:
updated_annotated_field = set_document_to_annotated_by_user(doc_id, user_id)
resp = service.post("annotations", json = new_annotation)
if resp.ok:
new_annotation["_id"] = resp.json()["_id"]
log.access_flask_annotate_document(document, new_annotation)
return jsonify(resp.ok)
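# Illustrative JSON body for the save_annotations endpoint above (shape inferred
# from the parsing in that view; the concrete labels and offsets are assumptions):
#   {
#       "doc": ["positive"],                                # document-level labels (strings)
#       "ner": [{"start": 0, "end": 5, "label": "PERSON"}]  # NER spans with character offsets
#   }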
def init_app(app):
app.register_blueprint(bp)

View File

@@ -0,0 +1,84 @@
# (C) 2019 The Johns Hopkins University Applied Physics Laboratory LLC.
import logging
import os
from . import log
log.setup_logging()
from flask import Flask, jsonify
from werkzeug import exceptions
from . import config
def handle_error(e):
logging.getLogger(__name__).error(e, exc_info=True)
return jsonify(e.description), e.code
def handle_uncaught_exception(e):
if isinstance(e, exceptions.InternalServerError):
original = getattr(e, "original_exception", None)
if original is None:
logging.getLogger(__name__).error(e, exc_info=True)
else:
logging.getLogger(__name__).error(original, exc_info=True)
return jsonify(e.description), e.code
elif isinstance(e, exceptions.HTTPException):
return handle_error(e)
else:
logging.getLogger(__name__).error(e, exc_info=True)
return jsonify(exceptions.InternalServerError.description), exceptions.InternalServerError.code
def create_app(test_config = None):
# create and configure the app
app = Flask(__name__, instance_relative_config=True)
app.config.from_object(config)
app.register_error_handler(exceptions.HTTPException, handle_error)
app.register_error_handler(Exception, handle_uncaught_exception)
if test_config is None:
# load the instance config, if it exists, when not testing
app.config.from_pyfile("config.py", silent=True)
else:
# load the test config if passed in
app.config.from_mapping(test_config)
# ensure the instance folder exists
try:
os.makedirs(app.instance_path)
except OSError:
pass
@app.route("/ping")
def ping():
return jsonify("pong")
from . import cors
cors.init_app(app)
from .auth import bp as authbp
authbp.init_app(app)
from .data import bp as databp
databp.init_app(app)
from .collections import bp as collectionsbp
collectionsbp.init_app(app)
from .documents import bp as documentsbp
documentsbp.init_app(app)
from .annotations import bp as annotationsbp
annotationsbp.init_app(app)
from .admin import bp as adminbp
adminbp.init_app(app)
from .pipelines import bp as pipelinebp
pipelinebp.init_app(app)
from .pineiaa import bp as iaabp
iaabp.init_app(app)
return app

View File

@@ -0,0 +1,6 @@
# (C) 2019 The Johns Hopkins University Applied Physics Laboratory LLC.
from .bp import login_required, admin_required, module, get_logged_in_user, is_flat
"""This module implements all methods required for user authentication and user verification with eve, oauth, and
vegas"""

View File

@@ -0,0 +1,116 @@
# (C) 2019 The Johns Hopkins University Applied Physics Laboratory LLC.
import abc
import functools
from flask import Blueprint, current_app, jsonify, request, Response, session
from werkzeug import exceptions
from .. import log, models
CONFIG_AUTH_MODULE_KEY = "AUTH_MODULE"
bp = Blueprint("auth", __name__, url_prefix = "/auth")
module = None
def is_flat():
return module.is_flat()
def get_logged_in_user():
return module.get_logged_in_user()
def login_required(view):
@functools.wraps(view)
def wrapped_view(*args, **kwargs):
if module.get_logged_in_user() is None and request.method.lower() != "options":
raise exceptions.Unauthorized(description = "Must be logged in.")
return view(*args, **kwargs)
return wrapped_view
def admin_required(view):
@functools.wraps(view)
def wrapped_view(*args, **kwargs):
user = module.get_logged_in_user()
if user is None and request.method.lower() != "options":
raise exceptions.Unauthorized(description = "Must be logged in.")
if user is not None and not user["is_admin"]:
raise exceptions.Unauthorized(description = "Must be an admin.")
return view(*args, **kwargs)
return wrapped_view
@bp.route("/module", methods = ["GET"])
def flask_get_module():
return jsonify(current_app.config[CONFIG_AUTH_MODULE_KEY])
@bp.route("/flat", methods = ["GET"])
def flask_get_flat() -> Response:
return jsonify(module.is_flat())
@bp.route("/can_manage_users", methods = ["GET"])
def flask_get_can_manage_users() -> Response:
return jsonify(module.can_manage_users())
@bp.route("/logged_in_user", methods = ["GET"])
def flask_get_logged_in_user() -> Response:
return jsonify(module.get_logged_in_user())
@bp.route("/logged_in_user_details", methods = ["GET"])
def flask_get_logged_in_user_details() -> Response:
return jsonify(module.get_logged_in_user_details().to_dict())
@bp.route("/login_form", methods = ["GET"])
def flask_get_login_form() -> Response:
return jsonify(module.get_login_form().to_dict())
@bp.route("/logout", methods = ["POST"])
def flask_post_logout() -> Response:
user = module.get_logged_in_user()
module.logout()
log.access_flask_logout(user)
return Response(status = 200)
class AuthModule(object):
__metaclass__ = abc.ABCMeta
def __init__(self, app, bp):
pass
@abc.abstractmethod
def is_flat(self) -> bool:
pass
@abc.abstractmethod
def can_manage_users(self) -> bool:
pass
@abc.abstractmethod
def get_login_form(self) -> models.LoginForm:
pass
def get_logged_in_user(self):
return session["auth"]["user"] if "auth" in session else None
def get_logged_in_user_details(self) -> models.UserDetails:
return None
def logout(self):
if "auth" in session:
del session["auth"]
def init_app(app):
module_config = app.config[CONFIG_AUTH_MODULE_KEY]
global module
if module_config == "vegas":
from . import vegas
module = vegas.VegasAuthModule(app, bp)
elif module_config == "eve":
from . import eve
module = eve.EveModule(app, bp)
else:
raise ValueError("Unknown auth module: {}".format(module_config))
app.register_blueprint(bp)

View File

@@ -0,0 +1,118 @@
# (C) 2019 The Johns Hopkins University Applied Physics Laboratory LLC.
from flask import jsonify, request, Response, session
from overrides import overrides
from werkzeug import exceptions
from . import bp, login_required, password
from .. import log, models
from ..data import users
class EveUser(models.AuthUser):
def __init__(self, data):
super(EveUser, self).__init__()
self.data = data
@property
@overrides
def id(self):
return self.data["_id"]
@property
@overrides
def username(self):
return self.data["email"]
@property
@overrides
def display_name(self):
return "{} {} ({})".format(self.data["firstname"], self.data["lastname"], self.username)
@property
@overrides
def is_admin(self):
return "administrator" in self.data["role"]
def get_details(self) -> models.UserDetails:
return models.UserDetails(self.data["firstname"], self.data["lastname"], self.data["description"])
class EveModule(bp.AuthModule):
def __init__(self, app, bp):
super(EveModule, self).__init__(app, bp)
bp.route("/login", methods = ["POST"])(self.login)
login_required(bp.route("/users", methods = ["GET"])(self.get_all_users))
login_required(bp.route("/logged_in_user_details", methods = ["POST"])(self.update_user_details))
login_required(bp.route("/logged_in_user_password", methods = ["POST"])(self.update_user_password))
@overrides
def is_flat(self) -> bool:
return False
@overrides
def can_manage_users(self) -> bool:
return True
@overrides
def get_logged_in_user_details(self):
return EveUser(session["auth"]["user_data"]).get_details()
def update_user_details(self):
details = models.UserDetails(**request.get_json())
ret = users.update_user(self.get_logged_in_user()["id"], details)
new_user_data = users.get_user(ret["_id"])
session["auth"] = {
"user": EveUser(new_user_data).to_dict(),
"user_data": new_user_data
}
return jsonify(True)
def update_user_password(self):
body = request.get_json()
current_password = body["current_password"]
new_password = body["new_password"]
valid = password.check_password(current_password, session["auth"]["user_data"]["passwdhash"])
if not valid:
raise exceptions.Unauthorized(description = "Current password does not match.")
users.set_user_password_by_id(self.get_logged_in_user()["id"], new_password)
return jsonify(True)
@overrides
def get_login_form(self) -> models.LoginForm:
return models.LoginForm([
models.LoginFormField("username", "Username or email", models.LoginFormFieldType.TEXT),
models.LoginFormField("password", "Password", models.LoginFormFieldType.PASSWORD)
], "Login")
def login(self) -> Response:
if not request.json or "username" not in request.json or "password" not in request.json:
raise exceptions.BadRequest(description = "Missing username and/or password.")
username = request.json["username"]
passwd = request.json["password"]
try:
user = users.get_user(username)
except exceptions.HTTPException:
try:
user = users.get_user_by_email(username)
if not user:
raise exceptions.Unauthorized(description = "User \"{}\" doesn't exist.".format(username))
except exceptions.HTTPException:
raise exceptions.Unauthorized(description = "User \"{}\" doesn't exist.".format(username))
if not "passwdhash" in user or not user["passwdhash"]:
raise exceptions.Unauthorized(description = "Your first-time password needs to be set by an administrator.")
valid = password.check_password(passwd, user["passwdhash"])
if not valid:
raise exceptions.Unauthorized(description = "Incorrect password for user \"{}\".".format(username))
session["auth"] = {
"user": EveUser(user).to_dict(),
"user_data": user
}
log.access_flask_login()
return jsonify(self.get_logged_in_user())
def get_all_users(self):
ret = []
for user in users.get_all_users():
ret.append(EveUser(user).to_dict())
return jsonify(ret)

View File

@@ -0,0 +1,101 @@
# (C) 2019 The Johns Hopkins University Applied Physics Laboratory LLC.
import abc
import traceback
from authlib.integrations.flask_client import OAuth
from flask import current_app, jsonify, redirect, request, Response, session
import jwt
from overrides import overrides
from werkzeug import exceptions
from . import bp
from .. import log, models
#{'sub': 'lglende1', 'aud': 'NLP Platform', 'subject': 'lglende1', 'iss': 'https://slife.jh.edu', 'token_type': 'access_token', 'exp': 1563587426, 'expires_in': 3600, 'iat': 1563583826, 'email': 'lglende1@jhu.edu', 'client_id': '1976d9d4-be86-44ce-aa0f-c5a4b295c701'}
class OAuthUser(models.AuthUser):
def __init__(self, data):
super(OAuthUser, self).__init__()
self.data = data
@property
@overrides
def id(self):
return self.data["subject"]
@property
@overrides
def username(self):
return self.data["subject"]
@property
@overrides
def display_name(self):
return self.data["subject"]
@property
@overrides
def is_admin(self):
return False
class OAuthModule(bp.AuthModule):
def __init__(self, app, bp):
super(OAuthModule, self).__init__(app, bp)
self.oauth = OAuth()
self.oauth.init_app(app)
self.app = self.register_oauth(self.oauth, app)
bp.route("/login", methods = ["POST"])(self.login)
bp.route("/authorize", methods = ["GET"])(self.authorize)
@abc.abstractmethod
def register_oauth(self, oauth, app):
pass
@abc.abstractmethod
def get_login_form_button_text(self):
pass
@overrides
def is_flat(self) -> bool:
return True
@overrides
def can_manage_users(self) -> bool:
return False
@overrides
def get_login_form(self) -> models.LoginForm:
return models.LoginForm([], self.get_login_form_button_text())
def login(self) -> Response:
if "return_to" in request.args:
redirect = self.app.authorize_redirect(request.args.get("return_to"), response_type = "token")
else:
redirect = self.app.authorize_redirect(response_type = "token")
return jsonify(redirect.headers["Location"])
def authorize(self):
authorization_response = request.full_path + "#" + request.args.get("fragment")
try:
token = self.app.fetch_access_token(authorization_response = authorization_response)
except Exception as e:
traceback.print_exc()
raise exceptions.SecurityError(description = str(e))
access_token = token["access_token"]
secret = current_app.config["VEGAS_CLIENT_SECRET"]
try:
decoded = jwt.decode(access_token, secret, verify = False)
decoded = jwt.decode(access_token, secret, verify = True, audience = decoded["aud"])
except jwt.exceptions.InvalidTokenError as e:
traceback.print_exc()
raise exceptions.SecurityError(description = str(e))
session["auth"] = {
"user": OAuthUser(decoded).to_dict(),
"user_data": decoded,
"token": token
}
log.access_flask_login()
return jsonify(self.get_logged_in_user())

View File

@@ -0,0 +1,15 @@
# (C) 2019 The Johns Hopkins University Applied Physics Laboratory LLC.
import base64
import bcrypt
import hashlib
def hash_password(password: str) -> str:
sha256 = hashlib.sha256(password.encode()).digest()
hashed_password_bytes = bcrypt.hashpw(sha256, bcrypt.gensalt())
return base64.b64encode(hashed_password_bytes).decode()
def check_password(password: str, hashed_password: str):
sha256 = hashlib.sha256(password.encode()).digest()
hashed_password_bytes = base64.b64decode(hashed_password.encode())
return bcrypt.checkpw(sha256, hashed_password_bytes)
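# Illustrative round trip through the helpers above (the literal password is an
# assumption): the password is SHA-256 digested before bcrypt, and the bcrypt hash
# is base64-encoded so it can be stored as text.
#   stored = hash_password("hunter2")
#   check_password("hunter2", stored)         # -> True
#   check_password("wrong-password", stored)  # -> False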

View File

@@ -0,0 +1,35 @@
# (C) 2019 The Johns Hopkins University Applied Physics Laboratory LLC.
from overrides import overrides
from . import oauth
#VEGAS_SERVER = "https://slife.jh.edu"
#VEGAS_CLIENT_ID = "1976d9d4-be86-44ce-aa0f-c5a4b295c701"
#VEGAS_LOGOUT = VEGAS_SERVER +'/VEGAS/saml/logout'
#VEGAS_AUTHORIZE_URL = "{}/VEGAS/oauth/authorize".format(VEGAS_SERVER)
#VEGAS_ACCESS_TOKEN_URL = "{}/VEGAS/api/oauth2/token".format(VEGAS_SERVER)
class VegasAuthModule(oauth.OAuthModule):
def __init__(self, app, bp):
super(VegasAuthModule, self).__init__(app, bp)
@overrides
def register_oauth(self, oauth, app):
server = app.config.get("VEGAS_SERVER", "https://slife.jh.edu")
return oauth.register(
name = "vegas",
client_id = app.config.get("VEGAS_CLIENT_ID", "1976d9d4-be86-44ce-aa0f-c5a4b295c701"),
client_secret = app.config["VEGAS_CLIENT_SECRET"],
access_token_url = None,
access_token_params = None,
refresh_token_url = app.config.get("VEGAS_ACCESS_TOKEN_URL", "{}/VEGAS/oauth/token".format(server)),
authorize_url = app.config.get("VEGAS_AUTHORIZE_URL", "{}/VEGAS/oauth/authorize".format(server)),
api_base_url = app.config.get("VEGAS_API_BASE_URL", "{}/VEGAS/oauth/".format(server)),
client_kwargs = None
)
@overrides
def get_login_form_button_text(self):
return "Login With JHED"

View File

@@ -0,0 +1,6 @@
# (C) 2019 The Johns Hopkins University Applied Physics Laboratory LLC.
"""This module contains the api methods required to interact with, organize, create, and display collections in the
front-end and store the collections in the backend"""
from .bp import user_can_annotate, user_can_view, user_can_add_documents_or_images, user_can_modify_document_metadata, user_can_annotate_by_id, user_can_view_by_id, user_can_add_documents_or_images_by_id, user_can_modify_document_metadata_by_id

View File

@@ -0,0 +1,650 @@
# (C) 2019 The Johns Hopkins University Applied Physics Laboratory LLC.
import csv
import io
import json
import logging
import os
import random
import traceback
import unicodedata
from flask import abort, Blueprint, current_app, jsonify, request, safe_join, send_file, send_from_directory
from werkzeug import exceptions
from .. import auth, log
from ..data import service
bp = Blueprint("collections", __name__, url_prefix = "/collections")
logger = logging.getLogger(__name__)
def _collection_user_can_projection():
return {"projection": json.dumps({
"creator_id": 1,
"annotators": 1,
"viewers": 1
})}
def _collection_user_can(collection, annotate):
user_id = auth.get_logged_in_user()["id"]
if annotate and not auth.is_flat():
return collection["creator_id"] == user_id or user_id in collection["annotators"]
else:
return collection["creator_id"] == user_id or user_id in collection["viewers"] or user_id in collection["annotators"]
def user_can_annotate(collection):
return _collection_user_can(collection, annotate = True)
def user_can_view(collection):
return _collection_user_can(collection, annotate = False)
def user_can_add_documents_or_images(collection):
# for now, this is the same thing as just being able to view
return user_can_view(collection)
def user_can_modify_document_metadata(collection):
# for now, this is the same thing as just being able to view
return user_can_view(collection)
def user_can_annotate_by_id(collection_id):
collection = service.get_item_by_id("/collections", collection_id, _collection_user_can_projection())
return _collection_user_can(collection, annotate = True)
def user_can_view_by_id(collection_id):
collection = service.get_item_by_id("/collections", collection_id, _collection_user_can_projection())
return _collection_user_can(collection, annotate = False)
def user_can_add_documents_or_images_by_id(collection_id):
collection = service.get_item_by_id("/collections", collection_id, _collection_user_can_projection())
return user_can_add_documents_or_images(collection)
def user_can_modify_document_metadata_by_id(collection_id):
collection = service.get_item_by_id("/collections", collection_id, _collection_user_can_projection())
return user_can_modify_document_metadata(collection)
def get_user_collections(archived, page):
"""
Return collections for the logged in user using pagination. Returns all collections if parameter "page" is "all",
or the collections associated with the given page. Can return archived or un-archived collections based upon the
"archived" flag.
:param archived: Bool
:param page: str
:return: Response
"""
user_id = auth.get_logged_in_user()["id"]
where = {
"archived": archived,
"$or": [
{"creator_id": user_id},
{"viewers": user_id},
{"annotators": user_id}
]
}
params = service.where_params(where)
if page == "all":
return jsonify(service.get_all_using_pagination("collections", params))
if page: params["page"] = page
resp = service.get("collections", params = params)
return service.convert_response(resp)
@bp.route("/unarchived", defaults = {"page": "all"}, methods = ["GET"])
@bp.route("/unarchived/<page>", methods = ["GET"])
@auth.login_required
def get_unarchived_user_collections(page):
"""
Return unarchived user collections for the corresponding page value. Default value returns collections for all
pages.
:param page: str
:return: Response
"""
return get_user_collections(False, page)
@bp.route("/archived", defaults = {"page": "all"}, methods = ["GET"])
@bp.route("/archived/<page>", methods = ["GET"])
@auth.login_required
def get_archived_user_collections(page):
"""
Return archived user collections for the corresponding page value. Default value returns collections for all
pages.
:param page: str
:return: Response
"""
return get_user_collections(True, page)
def archive_or_unarchive_collection(collection_id, archive):
"""
Set the "archived" boolean flag for the collection matching the provided collection_id.
:param collection_id: str
:param archive: Bool
:return: Response
"""
user_id = auth.get_logged_in_user()["id"]
resp = service.get("collections/" + collection_id)
if not resp.ok:
abort(resp.status_code)
collection = resp.json()
if not auth.is_flat() and collection["creator_id"] != user_id:
raise exceptions.Unauthorized("Only the creator can archive a collection.")
collection["archived"] = archive
headers = {"If-Match": collection["_etag"]}
service.remove_nonupdatable_fields(collection)
resp = service.put(["collections", collection_id], json = collection, headers = headers)
if not resp.ok:
abort(resp.status_code)
return get_collection(collection_id)
@bp.route("/archive/<collection_id>", methods = ["PUT"])
@auth.login_required
def archive_collection(collection_id):
"""
Archive the collection matching the provided collection id
:param collection_id: str
:return: Response
"""
return archive_or_unarchive_collection(collection_id, True)
@bp.route("/unarchive/<collection_id>", methods = ["PUT"])
@auth.login_required
def unarchive_collection(collection_id):
"""
Unarchive the collection matching the provided collection id
:param collection_id: str
:return: Response
"""
return archive_or_unarchive_collection(collection_id, False)
@bp.route("/by_id/<collection_id>", methods = ["GET"])
@auth.login_required
def get_collection(collection_id):
"""
Return the collection object for the collection matching the provided collection id. This object has the fields:
'creator_id', 'annotators', 'viewers', 'labels', 'metadata', 'archived', and 'configuration'.
:param collection_id: str
:return: Response
"""
resp = service.get("collections/" + collection_id)
if not resp.ok:
abort(resp.status_code)
collection = resp.json()
if user_can_view(collection):
return service.convert_response(resp)
else:
raise exceptions.Unauthorized()
@bp.route("/by_id/<collection_id>/download", methods = ["GET"])
@auth.login_required
def download_collection(collection_id):
resp = service.get("/collections/" + collection_id)
if not resp.ok:
abort(resp.status_code)
collection = resp.json()
if not user_can_view(collection):
raise exceptions.Unauthorized()
def flag(name): return name not in request.args or json.loads(request.args[name])
include_collection_metadata = flag("include_collection_metadata")
include_document_metadata = flag("include_document_metadata")
include_document_text = flag("include_document_text")
include_annotations = flag("include_annotations")
include_annotation_latest_version_only = flag("include_annotation_latest_version_only")
as_file = flag("as_file")
if include_collection_metadata:
col = dict(collection)
service.remove_eve_fields(col)
data = col
else:
data = {
"_id": collection["_id"]
}
params = service.where_params({
"collection_id": collection_id
})
if not include_document_metadata and not include_document_text:
params["projection"] = json.dumps({
"_id": 1
})
elif not include_document_metadata:
params["projection"] = json.dumps({
"_id": 1,
"text": 1
})
elif not include_document_text:
params["projection"] = json.dumps({
"text": 0,
"collection_id": 0
})
else:
params["projection"] = json.dumps({
"collection_id": 0
})
data["documents"] = service.get_all_using_pagination("documents", params)["_items"]
for document in data["documents"]:
service.remove_eve_fields(document)
if include_annotations:
params = service.where_params({
"document_id": document["_id"]
})
params["projection"] = json.dumps({
"collection_id": 0,
"document_id": 0
})
annotations = service.get_all_using_pagination("annotations", params)["_items"]
if include_annotation_latest_version_only:
document["annotations"] = annotations
else:
params = {"projection": params["projection"]}
document["annotations"] = []
for annotation in annotations:
all_versions = service.get_all_versions_of_item_by_id("annotations", annotation["_id"],
params=params)
document["annotations"] += all_versions["_items"]
for annotation in document["annotations"]:
service.remove_eve_fields(annotation,
remove_versions = include_annotation_latest_version_only)
if as_file:
data_bytes = io.BytesIO()
data_bytes.write(json.dumps(data).encode())
data_bytes.write(b"\n")
data_bytes.seek(0)
return send_file(
data_bytes,
as_attachment=True,
attachment_filename="collection_{}.json".format(collection_id),
mimetype="application/json"
)
else:
return jsonify(data)
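# Illustrative download request (the flag names match the flag() lookups above;
# values are parsed with json.loads, so use lowercase true/false):
#   GET /collections/by_id/<collection_id>/download?include_document_text=false&as_file=true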
def get_doc_and_overlap_ids(collection_id):
"""
Return lists of ids for overlapping and non-overlapping documents for the collection matching the provided
collection id.
:param collection_id: str
:return: tuple
"""
params = service.params({
"where": {"collection_id": collection_id, "overlap": 0},
"projection": {"_id": 1}
})
doc_ids = [doc["_id"] for doc in service.get_all_using_pagination("documents", params)['_items']]
# doc_ids = get_all_ids("documents?where={\"collection_id\":\"%s\",\"overlap\":0}"%(collection_id))
random.shuffle(doc_ids)
overlap_ids = get_overlap_ids(collection_id)
return (doc_ids, overlap_ids)
@bp.route("/add_annotator/<collection_id>", methods=["POST"])
@auth.login_required
def add_annotator_to_collection(collection_id):
user_id = json.loads(request.form["user_id"])
resp = service.get(["collections", collection_id])
if not resp.ok:
abort(resp.status_code)
collection = resp.json()
auth_id = auth.get_logged_in_user()["id"]
if not (collection["creator_id"] == auth_id):
raise exceptions.Unauthorized()
if user_id not in collection["annotators"]:
logger.info("new annotator: adding to collection")
collection["annotators"].append(user_id)
if user_id not in collection["viewers"]:
collection["viewers"].append(user_id)
to_patch = {
"annotators": collection["annotators"],
"viewers": collection["viewers"]
}
else:
to_patch = {
"annotators": collection["annotators"]
}
headers = {'Content-Type': 'application/json', 'If-Match': collection["_etag"]}
resp = service.patch(["collections", collection["_id"]], json=to_patch, headers=headers)
if not resp.ok:
abort(resp.status_code, resp.content)
return service.convert_response(resp)
else:
abort(409, "Annotator already exists in collection")
@bp.route("/add_viewer/<collection_id>", methods=["POST"])
@auth.login_required
def add_viewer_to_collection(collection_id):
user_id = json.loads(request.form["user_id"])
resp = service.get(["collections", collection_id])
if not resp.ok:
abort(resp.status_code)
collection = resp.json()
auth_id = auth.get_logged_in_user()["id"]
if not (collection["creator_id"] == auth_id):
raise exceptions.Unauthorized()
if user_id not in collection["viewers"]:
logger.info("new viewer: adding to collection")
collection["viewers"].append(user_id)
to_patch = {
"viewers": collection["viewers"]
}
headers = {'Content-Type': 'application/json', 'If-Match': collection["_etag"]}
resp = service.patch(["collections", collection["_id"]], json=to_patch, headers=headers)
if not resp.ok:
abort(resp.status_code, resp.content)
return service.convert_response(resp)
else:
abort(409, "Annotator already exists in collection")
@bp.route("/add_label/<collection_id>", methods=["POST"])
@auth.login_required
def add_label_to_collection(collection_id):
new_label = json.loads(request.form["new_label"])
resp = service.get(["collections", collection_id])
if not resp.ok:
abort(resp.status_code)
collection = resp.json()
auth_id = auth.get_logged_in_user()["id"]
if not (collection["creator_id"] == auth_id):
raise exceptions.Unauthorized()
if new_label not in collection["labels"]:
logger.info("new viewer: adding to collection")
collection["labels"].append(new_label)
to_patch = {
"labels": collection["labels"]
}
headers = {'Content-Type': 'application/json', 'If-Match': collection["_etag"]}
resp = service.patch(["collections", collection["_id"]], json=to_patch, headers=headers)
if not resp.ok:
abort(resp.status_code, resp.content)
return service.convert_response(resp)
else:
abort(409, "Annotator already exists in collection")
def get_overlap_ids(collection_id):
"""
Return the list of ids for overlapping documents for the collection matching the provided collection id.
:param collection_id: str
:return: list
"""
where = {"collection_id": collection_id, "overlap": 1}
params = service.where_params(where)
return [doc["_id"] for doc in service.get_all_using_pagination("documents", params)['_items']]
# Require a multipart form post:
# CSV is in the form file "file"
# Optional images are in the form file fields "imageFileN" where N is an (ignored) index
# If CSV is provided, the fields "csvTextCol" and "csvHasHeader" must also be provided in the form
# Collection jSON string is in the form field "collection"
# Pipeline ID is in the form field "pipelineId"
# Overlap is in the form field "overlap"
# Train every is in the form field "train_every"
# Any classifier parameters are in the form field "classifierParameters"
@bp.route("/", strict_slashes = False, methods = ["POST"])
@auth.login_required
def create_collection():
"""
Create a new collection based upon the entries provided in the POST request's associated form fields.
These fields include:
collection - collection name
overlap - ratio of overlapping documents. (0-1) with 0 being no overlap and 1 being every document has overlap, ex:
.90 - 90% of documents overlap
train_every - automatically train a new classifier after this many documents have been annotated
pipeline_id - the id value of the classifier pipeline associated with this collection (spacy, opennlp, corenlp)
classifierParameters - optional parameters that adjust the configuration of the chosen classifier pipeline.
archived - whether or not this collection should be archived.
A collection can be created with documents listed in a csv file. Each new line in the csv represents a new document.
The data of this csv can be passed to this method through the POST request's FILES field "file".
Fields used when creating a collection from an uploaded csv file:
csvTextCol - column of csv containing the text of the documents (default: 0)
csvHasHeader - boolean for whether or not the csv file has a header row (default: False)
A collection can also be created with a number of images through FILES fields "imageFileN" where N is an (ignored) index
:return: information about the created collection
"""
user_id = auth.get_logged_in_user()["id"]
try:
#posted_file = StringIO(str(request.files["document"].read(), "utf-8"))
#posted_data = json.loads(str(request.files["data"].read(), "utf-8"))
if "file" in request.files:
posted_file = io.StringIO(str(request.files["file"].read(), "utf-8-sig"))
csv_text_col = json.loads(request.form["csvTextCol"])
csv_has_header = json.loads(request.form["csvHasHeader"])
else:
posted_file = None
csv_text_col = None
csv_has_header = None
collection = json.loads(request.form["collection"])
overlap = json.loads(request.form["overlap"])
train_every = json.loads(request.form["train_every"])
pipeline_id = json.loads(request.form["pipelineId"])
if "classifierParameters" in request.form:
classifier_parameters = json.loads(request.form["classifierParameters"])
else:
classifier_parameters = None
image_files = []
for key in request.files:
if key and key.startswith("imageFile"):
image_files.append(request.files[key])
except Exception as e:
traceback.print_exc()
abort(400, "Error parsing input:" + str(e))
if collection["creator_id"] != user_id:
abort(exceptions.Unauthorized.code, "Can't create collections for other users.")
if "archived" not in collection:
collection["archived"] = False
if "configuration" not in collection:
collection["configuration"] = {}
# create collection
create_resp = service.post("/collections", json = collection)
if not create_resp.ok:
abort(create_resp.status_code, create_resp.content)
r = create_resp.json()
if r["_status"] != "OK":
abort(400, "Unable to create collection")
collection_id = r["_id"]
collection["_id"] = collection_id
log.access_flask_add_collection(collection)
logger.info("Created collection", collection_id)
#create classifier
# require collection_id, overlap, pipeline_id and labels
classifier_obj = {"collection_id": collection_id,
"overlap": overlap,
"pipeline_id": pipeline_id,
#"parameters": pipeline_params,
"labels": collection["labels"],
"train_every": train_every}
# filename?
if classifier_parameters:
classifier_obj["parameters"] = classifier_parameters
classifier_resp = service.post("/classifiers", json = classifier_obj)
# TODO: if it failed, roll back the created collection
if not classifier_resp.ok:
abort(classifier_resp.status_code, classifier_resp.content)
r = classifier_resp.json()
if r["_status"] != "OK":
abort(400, "Unable to create classifier")
classifier_id = r["_id"]
logger.info("Created classifier", classifier_id)
# create metrics for classifier
# require collection_id, classifier_id, document_ids and annotations ids
metrics_obj = {"collection_id": collection_id,
"classifier_id": classifier_id,
"documents": list(),
"annotations": list(),
"folds": list(),
"metrics": list()
}
metrics_resp = service.post("/metrics", json = metrics_obj)
# TODO: if it failed, roll back the created collection
if not metrics_resp.ok:
abort(metrics_resp.status_code, metrics_resp.content)
r = metrics_resp.json()
if r["_status"] != "OK":
abort(400, "Unable to create metrics")
metrics_id = r["_id"]
logger.info("Created metrics", metrics_id)
#create documents if CSV file was sent in
doc_ids = []
if posted_file is not None:
docs = []
csvreader = csv.reader(posted_file)
first = True
headers = None
initial_has_annotated_dict = {}
for ann_id in collection["annotators"]:
initial_has_annotated_dict[ann_id] = False
for row in csvreader:
if len(row) == 0: continue # empty row
if first and csv_has_header:
headers = row
first = False
continue
metadata = {}
if csv_has_header:
if len(row)!=len(headers):
continue
for i in range(len(row)):
if i != csv_text_col:
metadata[headers[i]] = row[i]
doc = {
"creator_id": user_id,
"collection_id": collection_id,
"text": row[csv_text_col],
"metadata": metadata,
"has_annotated" : initial_has_annotated_dict
}
if random.random() < overlap:
doc["overlap"] = 1
else:
doc["overlap"] = 0
docs.append(doc)
doc_resp = service.post("/documents", json=docs)
# TODO if it failed, roll back the created collection and classifier
if not doc_resp.ok:
abort(doc_resp.status_code, doc_resp.content)
r = doc_resp.json()
# TODO if it failed, roll back the created collection and classifier
if r["_status"] != "OK":
abort(400, "Unable to create documents")
logger.info(r["_items"])
for obj in r["_items"]:
if obj["_status"] != "OK":
abort(400, "Unable to create documents")
doc_ids = [obj["_id"] for obj in r["_items"]]
logger.info("Added docs:", doc_ids)
# create next ids
(doc_ids, overlap_ids) = get_doc_and_overlap_ids(collection_id)
overlap_obj = {
"classifier_id": classifier_id,
"document_ids": doc_ids,
"overlap_document_ids": { ann_id: overlap_ids for ann_id in collection["annotators"] }
}
#for ann_id in collection["annotators"]:
# overlap_obj["overlap_document_ids"][ann_id] = overlap_ids
# TODO if it failed, roll back the created collection and classifier and documents
instances_response = service.post("/next_instances", json=overlap_obj)
if not instances_response.ok:
abort(instances_response.status_code, instances_response.content)
#post_items("next_instances", overlap_obj)
doc_ids.extend(overlap_ids)
# upload any image files
for image_file in image_files:
_upload_collection_image_file(collection_id, image_file.filename, image_file)
return service.convert_response(create_resp)
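# A minimal client-side sketch of calling the create-collection endpoint above
# (illustrative only; the URL, cookie handling, and exact collection schema are
# assumptions, but the form field names match the parsing in create_collection):
#   import json, requests
#   form = {
#       "collection": json.dumps({"creator_id": user_id, "annotators": [user_id],
#                                 "viewers": [user_id], "labels": ["PERSON", "ORG"],
#                                 "metadata": {"title": "My collection"}}),
#       "csvTextCol": "0", "csvHasHeader": "false",
#       "overlap": "0.1", "train_every": "20",
#       "pipelineId": json.dumps(pipeline_id),
#   }
#   files = {"file": open("docs.csv", "rb")}
#   requests.post(backend_url + "/collections/", data=form, files=files, cookies=session_cookies)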
def _check_collection_and_get_image_dir(collection_id, path):
# make sure user can view collection
if not user_can_view_by_id(collection_id):
raise exceptions.Unauthorized()
image_dir = current_app.config["DOCUMENT_IMAGE_DIR"]
if not image_dir:
raise exceptions.InternalServerError()
# "static" is a special path that grabs images not associated with a particular collection
# this is mostly for testing
if not path.startswith("static/"):
image_dir = os.path.join(image_dir, "by_collection", collection_id)
return os.path.realpath(image_dir)
@bp.route("/image/<collection_id>/<path:path>", methods=["GET"])
@auth.login_required
def get_collection_image(collection_id, path):
image_dir = _check_collection_and_get_image_dir(collection_id, path)
return send_from_directory(image_dir, path)
@bp.route("/image_exists/<collection_id>/<path:path>", methods=["GET"])
@auth.login_required
def get_collection_image_exists(collection_id, path):
image_dir = _check_collection_and_get_image_dir(collection_id, path)
image_file = safe_join(image_dir, path)
return jsonify(os.path.isfile(image_file))
def _path_split(path): # I can't believe there's no os.path function to do this
dirs = []
while True:
(path, d) = os.path.split(path)
if d != "":
dirs.append(d)
else:
if path != "":
dirs.append(path)
break
dirs.reverse()
return dirs
def _safe_path(path):
# inspired by safe_filename() from werkzeug but allowing /'s
path = unicodedata.normalize("NFKD", path).encode("ascii", "ignore").decode("ascii")
if path.startswith("/"):
path = path[1:]
if path.endswith("/"):
path = path[0:-1]
return "/".join([p for p in _path_split(path) if p not in [".", "/", ".."]])
@bp.route("/can_add_documents_or_images/<collection_id>", methods=["GET"])
@auth.login_required
def get_user_can_add_documents_or_images(collection_id):
return jsonify(user_can_add_documents_or_images_by_id(collection_id))
def _upload_collection_image_file(collection_id, path, image_file):
# get filename on disk to save to
path = _safe_path(path)
image_dir = _check_collection_and_get_image_dir(collection_id, path)
image_filename = os.path.realpath(os.path.join(image_dir, path))
if not image_filename.startswith(image_dir): # this shouldn't happen but just to be sure
raise exceptions.BadRequest("Invalid path.")
if not os.path.isdir(os.path.dirname(image_filename)):
os.makedirs(os.path.dirname(image_filename))
# save image
image_file.save(image_filename)
return "/" + path
@bp.route("/image/<collection_id>/<path:path>", methods=["POST", "PUT"])
@auth.login_required
def post_collection_image(collection_id, path):
if not user_can_add_documents_or_images_by_id(collection_id):
raise exceptions.Unauthorized()
if "file" not in request.files:
raise exceptions.BadRequest("Missing file form part.")
return jsonify(_upload_collection_image_file(collection_id, path, request.files["file"]))
def init_app(app):
app.register_blueprint(bp)

View File

@@ -0,0 +1,30 @@
# (C) 2019 The Johns Hopkins University Applied Physics Laboratory LLC.
import os
# default configuration values
SECRET_KEY = "Cq13XII=%"
DEBUG = True
if os.environ.get("EVE_SERVER"):
EVE_SERVER = os.environ.get("EVE_SERVER")
elif os.environ.get("FLASK_ENV") and os.environ.get("FLASK_ENV").startswith("dev"):
EVE_SERVER = "http://localhost:5001"
else:
EVE_SERVER = "http://eve:7510"
if os.environ.get("REDIS_SERVER"):
REDIS_SERVER = os.environ.get("REDIS_SERVER")
elif os.environ.get("FLASK_ENV") and os.environ.get("FLASK_ENV").startswith("dev"):
REDIS_SERVER = "localhost"
else:
REDIS_SERVER = "redis"
REDIS_PORT = int(os.environ.get("REDIS_PORT", 6479))
AUTH_MODULE = os.environ.get("AUTH_MODULE", "vegas")
VEGAS_CLIENT_SECRET = os.environ.get("VEGAS_CLIENT_SECRET", None)
DOCUMENT_IMAGE_DIR = os.environ.get("DOCUMENT_IMAGE_DIR")
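# Example environment overrides (illustrative; the variable names match the
# os.environ lookups above, the values are assumptions):
#   export EVE_SERVER=http://localhost:5001
#   export REDIS_SERVER=localhost
#   export AUTH_MODULE=eve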

View File

@@ -0,0 +1,17 @@
# (C) 2019 The Johns Hopkins University Applied Physics Laboratory LLC.
from flask import request
from flask_cors import CORS
def not_found(e):
# because of CORS, we weren't properly getting the 404 because of the pre-flight OPTIONS
# so return 200 for pre-flight check and let the actual get/post/whatever get a 404 that can
# be read with CORS
if request.method.upper() == "OPTIONS":
return "Page not available but this is only a pre-flight check", 200
return e
def init_app(app):
CORS(app, supports_credentials = True)
app.register_error_handler(404, not_found)

View File

@@ -0,0 +1 @@
# (C) 2019 The Johns Hopkins University Applied Physics Laboratory LLC.

View File

@@ -0,0 +1,10 @@
# (C) 2019 The Johns Hopkins University Applied Physics Laboratory LLC.
from . import users
def init_app(app):
app.cli.add_command(users.print_users_command)
app.cli.add_command(users.set_user_password)
app.cli.add_command(users.add_admin_command)
app.cli.add_command(users.reset_user_passwords)

View File

@@ -0,0 +1,221 @@
# (C) 2019 The Johns Hopkins University Applied Physics Laboratory LLC.
import json
import logging
import math
from pprint import pprint
import threading
from flask import abort, current_app, Response
import requests
logger = logging.getLogger(__name__)
class PerformanceHistory(object):
def __init__(self):
self.data = {
"get": {},
"post": {},
"put": {},
"delete": {},
"patch": {}
}
self.lock = threading.Lock()
def pprint(self):
self.lock.acquire()
try:
pprint(self.data)
finally:
self.lock.release()
def add(self, rest_type, path, response):
if rest_type not in self.data.keys():
raise ValueError("Invalid rest type {}".format(rest_type))
self.lock.acquire()
try:
data_type = _standardize_path(path)[0]
if data_type not in self.data[rest_type]:
self.data[rest_type][data_type] = {
"count": 0,
"response_content_size": 0,
"request_body_size": 0
}
p = self.data[rest_type][data_type]
p["count"] += 1
p["response_content_size"] += len(response.content)
if response.request.body:
p["request_body_size"] += len(response.request.body)
finally:
self.lock.release()
PERFORMANCE_HISTORY = PerformanceHistory()
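# Illustrative shape of the accumulated stats after a few GETs of /collections
# (the counts and sizes are made-up values):
#   PERFORMANCE_HISTORY.data["get"]["collections"] == {
#       "count": 3,
#       "response_content_size": 1824,
#       "request_body_size": 0
#   }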
def _standardize_path(path, *additional_paths):
if type(path) not in [list, tuple, set]:
path = [path]
if additional_paths:
path += additional_paths
# for every element in path, split by "/" into a list of paths, then remove empty values
# "/test" => ["test"], ["/test", "1"] => ["test", "1"], etc.
return [single_path for subpath in path for single_path in subpath.split("/") if single_path]
def url(path, *additional_paths):
"""Returns a complete URL for the given eve-relative path(s).
:param path: str: eve-relative path (e.g. "collections" or ["collections", id])
:param additional_paths: str[]: any additional paths to append
:return: str url
"""
return "/".join([current_app.config["EVE_SERVER"].strip("/")] +
_standardize_path(path, *additional_paths))
def where_params(where):
"""Returns a "where" parameters object that can be passed to eve.
Eve requires that dict parameters be serialized as JSON.
:param where: dict: dictionary of "where" params to pass to eve
:return: dict "where" params
"""
return params({"where": where})
def params(params):
"""Returns a parameters object that can be passed to eve.
Eve requires that dict parameters be serialized as JSON.
:param params: dict: dictionary of parameters to pass to eve
:return: dict of params with dict values serialized as JSON
"""
return {key: json.dumps(value) if type(value) == dict else value for (key, value) in params.items()}
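# Illustrative results of the helpers above (dict values are JSON-encoded for eve,
# everything else passes through unchanged):
#   where_params({"collection_id": "abc"})  -> {"where": '{"collection_id": "abc"}'}
#   params({"where": {"x": 1}, "page": 2})  -> {"where": '{"x": 1}', "page": 2}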
def get(path, **kwargs):
"""Wraps requests.get for the given eve-relative path.
:param path: str: eve-relative path (e.g. "collections" or ["collections", id])
:param **kwargs: dict: any additional arguments to pass to requests.get
:return: requests.Response
"""
global PERFORMANCE_HISTORY
resp = requests.get(url(path), **kwargs)
PERFORMANCE_HISTORY.add("get", path, resp)
return resp
def post(path, **kwargs):
"""Wraps requests.post for the given eve-relative path.
:param path: str: eve-relative path (e.g. "collections" or ["collections", id])
:param **kwargs: dict: any additional arguments to pass to requests.post
:return: requests.Response
"""
global PERFORMANCE_HISTORY
resp = requests.post(url(path), **kwargs)
PERFORMANCE_HISTORY.add("post", path, resp)
return resp
def put(path, **kwargs):
"""Wraps requests.put for the given eve-relative path.
:param path: str: eve-relative path (e.g. "collections" or ["collections", id])
:param **kwargs: dict: any additional arguments to pass to requests.put
:return: requests.Response
"""
global PERFORMANCE_HISTORY
resp = requests.put(url(path), **kwargs)
PERFORMANCE_HISTORY.add("put", path, resp)
return resp
def delete(path, **kwargs):
"""Wraps requests.delete for the given eve-relative path.
:param path: str: eve-relative path (e.g. "collections" or ["collections", id])
:param **kwargs: dict: any additional arguments to pass to requests.delete
:return: requests.Response
"""
global PERFORMANCE_HISTORY
resp = requests.delete(url(path), **kwargs)
PERFORMANCE_HISTORY.add("delete", path, resp)
return resp
def patch(path, **kwargs):
"""Wraps requests.patch for the given eve-relative path.
:param path: str: eve-relative path (e.g. "collections" or ["collections", id])
:param **kwargs: dict: any additional arguments to pass to requests.patch
:return: requests.Response
"""
global PERFORMANCE_HISTORY
resp = requests.patch(url(path), **kwargs)
PERFORMANCE_HISTORY.add("patch", path, resp)
return resp
def get_items(path, params={}):
resp = get(path, params=params)
if not resp.ok:
abort(resp.status_code, resp.content)
return resp.json()["_items"]
def get_item_by_id(path, item_id, params={}):
resp = get([path, item_id], params=params)
if not resp.ok:
abort(resp.status_code, resp.content)
return resp.json()
def get_all_versions_of_item_by_id(path, item_id, params = None):
params = dict(params) if params else {}
params["version"] = "all"
resp = get([path, item_id], params=params)
if not resp.ok:
abort(resp.status_code, resp.content)
return resp.json()
def get_all_using_pagination(path, params):
resp = get(path, params=params)
if not resp.ok:
abort(resp.status_code)
body = resp.json()
if "_meta" not in body:
return body
all_items = {"_items": []}
all_items["_items"] += body["_items"]
page = body["_meta"]["page"]
total_pages = math.ceil(body["_meta"]["total"] / body["_meta"]["max_results"])
while page < total_pages:
page += 1
params["page"] = page
resp = get(path, params=params)
if not resp.ok:
abort(resp.status_code)
body = resp.json()
all_items["_items"] += body["_items"]
page = body["_meta"]["page"]
total_pages = math.ceil(body["_meta"]["total"] / body["_meta"]["max_results"])
return all_items
def convert_response(requests_response):
return Response(requests_response.content,
requests_response.status_code,
requests_response.raw.headers.items())
def remove_eve_fields(obj, remove_timestamps = True, remove_versions = True):
fields = ["_etag", "_links"]
if remove_timestamps: fields += ["_created", "_updated"]
if remove_versions: fields += ["_version", "_latest_version"]
for f in fields:
if f in obj:
del obj[f]
def remove_nonupdatable_fields(obj):
remove_eve_fields(obj)

View File

@@ -0,0 +1,114 @@
# (C) 2019 The Johns Hopkins University Applied Physics Laboratory LLC.
import click
from flask import abort
from flask.cli import with_appcontext
from . import service
from ..auth import password as authpassword
from .. import models
def get_all_users():
return service.get_items("/users")
def get_user(user_id):
return service.get_item_by_id("/users", user_id)
def get_user_by_email(email):
where = {
"email": email
}
users = service.get_items("/users", service.where_params(where))
if len(users) > 0:
return users[0]
else:
return None
def get_user_details(user_id):
user = get_user(user_id)
return models.UserDetails(first_name = user["firstname"], last_name = user["lastname"],
description = user["description"])
def update_user(user_id: str, details: models.UserDetails):
current = get_user(user_id)
if details.first_name is not None:
current["firstname"] = details.first_name
if details.last_name is not None:
current["lastname"] = details.last_name
if details.description is not None:
current["description"] = details.description
etag = current["_etag"]
service.remove_nonupdatable_fields(current)
resp = service.put(["users", user_id], json = current, headers = {"If-Match": etag})
if not resp.ok:
abort(resp.status_code, resp.content)
return resp.json()
@click.command("print-users")
@with_appcontext
def print_users_command():
click.echo("Using data backend {}".format(service.url("")))
for user in get_all_users():
click.echo("* {} (uuid {})\n {}".format(user["email"], user["_id"], user["description"]))
@click.command("add-admin")
@click.argument("username")
@click.argument("password")
@with_appcontext
def add_admin_command(username, password):
user = {
"email": username,
"passwdhash": authpassword.hash_password(password),
"firstname": "New",
"lastname": "Administrator",
"description": "New administrator account.",
"role": ["administrator"]
}
resp = service.post("users", json = user)
if not resp.ok:
abort(resp.status_code)
def set_user_password_by_id(user_id, password):
user = get_user(user_id)
user["passwdhash"] = authpassword.hash_password(password)
etag = user["_etag"]
service.remove_nonupdatable_fields(user)
headers = {"If-Match": etag}
print("Putting to {}: {}".format(service.url(["users", user["_id"]]), user))
resp = service.put(["users", user["_id"]], json = user, headers = headers)
if not resp.ok:
abort(resp.status_code)
@click.command("set-user-password")
@click.argument("username")
@click.argument("password")
@with_appcontext
def set_user_password(username, password):
click.echo("Using data backend {}".format(service.url("")))
user = get_user_by_email(username)
user["passwdhash"] = authpassword.hash_password(password)
etag = user["_etag"]
service.remove_nonupdatable_fields(user)
headers = {"If-Match": etag}
click.echo("Putting to {}: {}".format(service.url(["users", user["_id"]]), user))
resp = service.put(["users", user["_id"]], json = user, headers = headers)
if not resp.ok:
click.echo("Failure! {}".format(resp))
else:
click.echo("Success!")
@click.command("reset-user-passwords")
@with_appcontext
def reset_user_passwords():
click.echo("Using data backend {}".format(service.url("")))
for user in get_all_users():
user["passwdhash"] = authpassword.hash_password(user["email"])
etag = user["_etag"]
service.remove_nonupdatable_fields(user)
headers = {"If-Match": etag}
click.echo("Putting to {}: {}".format(service.url("users", user["_id"]), user))
resp = service.put("users", user["_id"], json = user, headers = headers)
if not resp.ok:
click.echo("Failure! {}".format(resp))
else:
click.echo("Success!")

View File

@@ -0,0 +1 @@
# (C) 2019 The Johns Hopkins University Applied Physics Laboratory LLC.

View File

@@ -0,0 +1,133 @@
# (C) 2019 The Johns Hopkins University Applied Physics Laboratory LLC.
import json
from flask import abort, Blueprint, jsonify, request
from werkzeug import exceptions
from .. import auth, collections, log
from ..data import service
bp = Blueprint("documents", __name__, url_prefix = "/documents")
def _document_user_can_projection():
return service.params({"projection": {
"collection_id": 1
}})
def user_can_annotate(document):
return collections.user_can_annotate_by_id(document["collection_id"])
def user_can_view(document):
return collections.user_can_view_by_id(document["collection_id"])
def user_can_modify_metadata(document):
return collections.user_can_modify_document_metadata_by_id(document["collection_id"])
def user_can_annotate_by_id(document_id):
document = service.get_item_by_id("documents", document_id, params=_document_user_can_projection())
return user_can_annotate(document)
def user_can_view_by_id(document_id):
document = service.get_item_by_id("documents", document_id, params=_document_user_can_projection())
return user_can_view(document)
def user_can_modify_metadata_by_id(document_id):
document = service.get_item_by_id("documents", document_id, params=_document_user_can_projection())
return user_can_modify_metadata(document)
@bp.route("/by_id/<doc_id>", methods = ["GET"])
@auth.login_required
def get_document(doc_id):
resp = service.get("documents/" + doc_id)
if not resp.ok:
abort(resp.status_code)
document = resp.json()
if user_can_view(document):
log.access_flask_view_document(document)
return service.convert_response(resp)
else:
raise exceptions.Unauthorized()
@bp.route("/by_collection_id/<col_id>", defaults = {"page": "all"}, methods = ["GET"])
@bp.route("/by_collection_id/<col_id>/<page>", methods = ["GET"])
@auth.login_required
def get_documents_in_collection(col_id, page):
truncate = json.loads(request.args.get("truncate", "true"))
truncate_length = json.loads(request.args.get("truncateLength", "50"))
collection = service.get_item_by_id("collections", col_id)
if not collections.user_can_view(collection):
raise exceptions.Unauthorized()
params = service.where_params({
"collection_id": col_id
})
if truncate:
params["projection"] = json.dumps({"metadata": 0})
params["truncate"] = truncate_length
if page == "all":
return jsonify(service.get_all_using_pagination("documents", params))
if page: params["page"] = page
resp = service.get("documents", params = params)
if not resp.ok:
abort(resp.status_code, resp.content)
data = resp.json()
if truncate:
for document in data["_items"]:
document["text"] = document["text"][0:truncate_length]
return jsonify(data)
@bp.route("/", strict_slashes = False, methods = ["POST"])
@auth.login_required
def add_document():
document = request.get_json()
if not document or "collection_id" not in document or not document["collection_id"]:
raise exceptions.BadRequest()
if not collections.user_can_add_documents_or_images_by_id(document["collection_id"]):
raise exceptions.Unauthorized()
resp = service.post("documents", json=document)
if resp.ok:
log.access_flask_add_document(resp.json())
return service.convert_response(resp)
@bp.route("/can_annotate/<doc_id>", methods = ["GET"])
@auth.login_required
def can_annotate_document(doc_id):
return jsonify(user_can_annotate_by_id(doc_id))
@bp.route("/can_modify_metadata/<doc_id>", methods = ["GET"])
@auth.login_required
def can_modify_metadata(doc_id):
return jsonify(user_can_modify_metadata_by_id(doc_id))
@bp.route("/metadata/<doc_id>", methods = ["PUT"])
def update_metadata(doc_id):
metadata = request.get_json()
if not metadata:
raise exceptions.BadRequest("Missing body JSON.")
# get document
document = service.get_item_by_id("documents", doc_id, params={
"projection": json.dumps({
"metadata": 1,
"collection_id": 1
})
})
if not user_can_modify_metadata(document):
raise exceptions.Unauthorized()
# update document
if "metadata" in document:
document["metadata"].update(metadata)
else:
document["metadata"] = metadata
headers = {"If-Match": document["_etag"]}
service.remove_nonupdatable_fields(document)
return service.convert_response(
service.patch(["documents", doc_id], json = document, headers = headers))
def init_app(app):
app.register_blueprint(bp)

View File

@@ -0,0 +1 @@
# (C) 2019 The Johns Hopkins University Applied Physics Laboratory LLC.

View File

@@ -0,0 +1,521 @@
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
# **********************************************************************
# Copyright (C) 2019 Johns Hopkins University Applied Physics Laboratory
#
# All Rights Reserved.
# This material may only be used, modified, or reproduced by or for the
# U.S. government pursuant to the license rights granted under FAR
# clause 52.227-14 or DFARS clauses 252.227-7013/7014.
# For any other permission, please contact the Legal Office at JHU/APL.
# **********************************************************************
import asyncio
import atexit
import concurrent
import json
import logging
import os
import platform
import secrets
import time
import uuid
from concurrent.futures import ThreadPoolExecutor
from datetime import datetime, timedelta
from threading import Thread
import aioredis
import pydash
import redis
from pebble import ThreadPool
from ..shared.config import ConfigBuilder
config = ConfigBuilder.get_config()
logger = logging.getLogger(__name__)
class ServiceManager(object):
# Redis Client
logger.info("Using redis host {}:{}".format(config.REDIS_HOST, config.REDIS_PORT))
r_pool = redis.ConnectionPool(host=config.REDIS_HOST, port=config.REDIS_PORT, decode_responses=True)
r_conn = redis.StrictRedis(connection_pool=r_pool, charset="utf-8", decode_responses=True)
# Redis Key Prefixes
redis_key_prefix = config.REDIS_PREFIX + "registration:"
redis_reg_key_prefix = redis_key_prefix + "codex:"
redis_channels_key = redis_key_prefix + "channels"
redis_channel_ttl_key_prefix = redis_key_prefix + "channel-ttl:" # not really ttl, but more like registered date
redis_work_queue_key_prefix = redis_key_prefix + "work-queue:"
redis_work_mutex_key_prefix = redis_key_prefix + "work-mutex:"
redis_handler_mutex_key_prefix = redis_key_prefix + "handler-mutex:"
# Redis Key TTL (Used by Expire Calls)
# this is how long a registered service will live unless it gets an update
redis_reg_key_ttl = int(timedelta(seconds=config.SCHEDULER_REGISTRATION_TIMEOUT).total_seconds())
redis_channels_key_ttl = int(timedelta(minutes=60).total_seconds()) # do not touch (this is how long the overall list of channels will live)
redis_work_queue_key_ttl = int(timedelta(seconds=config.SCHEDULER_QUEUE_TIMEOUT).total_seconds()) # how long a job can sit in the queue before it expires
# Redis Mutex Key TTL (Use by locks)
redis_work_mutex_key_ttl = int(timedelta(minutes=10).total_seconds()) # how long can this mutex be acquired
redis_handler_mutex_key_ttl = int(timedelta(seconds=5).total_seconds()) # how long should the same handler for the same job be locked from running...
# Timeouts
# how long a handler should take (may not kill it in all platforms if expired because it's a thread)
handler_timeout = int(timedelta(seconds=config.SCHEDULER_HANDLER_TIMEOUT).total_seconds())
# Worker Names
registration_worker_name = "registration_worker"
processing_worker_name = "processing_worker"
channel_worker_name = "channel_worker"
# Channel Names
shutdown_channel = "shutdown"
registration_channel = "registration"
# Reserved Channels
reserved_channels = frozenset({shutdown_channel, registration_channel})
def __init__(self, default_handler=None):
"""
:type default_handler: callable
"""
self.shutdown_key = secrets.token_hex(16) # 32-char thread-local shutdown secret (prevents external service from shutting down listeners)
self.aio_loop = asyncio.new_event_loop() # this event loop is meant to be used in a non-main thread
self.registered_channels = set() # type: set[str]
self.is_running = False # flag to use to quickly determine if system has started
self.pubsubs = dict() # type: dict[str, redis.client.PubSub] # keep's all the thread-local pubsubs
self.workers = dict() # type: dict[str, Thread] # keep's all the worker threads
self.handlers = dict() # type: dict[str, callable] # handler dictionary for future dynamic hanlders
self.handlers["default"] = default_handler if callable(default_handler) else lambda: None # register default handler
# make sure things are stopped properly
atexit.register(self.stop_listeners)
@classmethod
def get_registered_channels(cls, include_ttl=False):
"""
Get list of registered channels, with registration time if requested.
:type include_ttl: bool
:rtype: list[str] | dict[str, datetime]
"""
registered_channels = cls.r_conn.smembers(cls.redis_channels_key)
if not include_ttl:
return list(registered_channels)
with cls.r_conn.pipeline() as pipe:
for channel in registered_channels:
pipe.get(cls.redis_channel_ttl_key_prefix + channel)
values = pipe.execute()
final_values = []
for redis_val in values:
try:
channel_ttl_int = pydash.parse_int(redis_val, 10)
final_values.append(datetime.utcfromtimestamp(channel_ttl_int))
except (ValueError, TypeError):
# append None to keep alignment with registered_channels when a timestamp is missing or malformed
final_values.append(None)
return dict(zip(registered_channels, final_values))
@classmethod
def get_registered_service_details(cls, service_name=None):
"""
Get registration details of a service.
:type service_name: str
:rtype: None | dict
"""
if not isinstance(service_name, str):
return None
service_details = cls.r_conn.get(cls.redis_reg_key_prefix + service_name)
try:
service_details = json.loads(service_details)
except (json.JSONDecodeError, TypeError):
logger.warning("Unable to decode service details")
return None
return service_details
@classmethod
def get_registered_services(cls, include_details=True):
"""
Get list of registered services and registration body if requested.
:type include_details: bool
:rtype: list[str] | list[dict]
"""
registered_keys = []
registered_names = []
for key in cls.r_conn.scan_iter(match=cls.redis_reg_key_prefix + "*"):
service_name = key.rsplit(":", 1)[-1]  # "<prefix>:registration:codex:<name>" --> ["<prefix>:registration:codex", "<name>"]
registered_names.append(service_name)
registered_keys.append(key)
if not include_details:
return registered_names
with cls.r_conn.pipeline() as pipe:
for key in registered_keys:
pipe.get(key)
values = pipe.execute()
final_values = []
for redis_val in values:
try:
final_values.append(json.loads(redis_val))
except (json.JSONDecodeError, TypeError):
continue
return final_values
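# Illustrative return value with include_details=True (field values are examples only, mirroring the
# registration record written by the registration listener below):
#   [{"name": "service_opennlp", "version": "1.0", "channel": "service_opennlp",
#     "framework": "opennlp", "framework_types": ["fit", "predict"]}]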
@classmethod
def send_service_request(cls, service_name, data, job_id=None, encoder=None):
"""
Queues a job for the requested service.
:type service_name: str
:type data: dict
:type job_id: str
:type encoder: json.JSONEncoder
:rtype: None | dict
"""
registered_services = cls.get_registered_services(include_details=False)
if service_name not in set(registered_services):
logger.warning("Unable to retrieve service.")
return None
service_details = cls.r_conn.get(cls.redis_reg_key_prefix + service_name)
if not service_details:
logger.warning("Unable to retrieve service details.")
return None
try:
service_details = json.loads(service_details)
except (json.JSONDecodeError, TypeError):
logger.warning("Unable to load service details.")
return None
service_channel = pydash.get(service_details, "channel", None)
if not isinstance(service_channel, str):
logger.warning("Unable to load service details.")
return None
redis_queue_key = cls.redis_work_queue_key_prefix + service_name
job_id_to_use = job_id if isinstance(job_id, str) else uuid.uuid4().hex
request_body = {"job_id": job_id_to_use, "job_type": "request", "job_queue": redis_queue_key}
request_body_publish = json.dumps(request_body.copy(), separators=(",", ":"), cls=encoder)
request_body.update({"job_data": data})
try:
request_body_queue = json.dumps(request_body, separators=(",", ":"), cls=encoder)
except (json.JSONDecodeError, TypeError):
logger.warning("Unable to encode data.")
return None
with cls.r_conn.pipeline() as pipe:
pipe.rpush(redis_queue_key, request_body_queue)
# if nothing is processed within "redis_work_queue_key_ttl" seconds of the last insert, the queue will be deleted
pipe.expire(redis_queue_key, cls.redis_work_queue_key_ttl)
num_pushed, expire_is_set = pipe.execute()
if num_pushed < 1 or expire_is_set is not True:
return None
num_received = cls.r_conn.publish(service_channel, request_body_publish)
# no-one consumed it
if num_received <= 0:
# TODO: potential race-condition here due to popping potentially wrong thing in multi-process (we should use a lock)
cls.r_conn.rpop(redis_queue_key)
return None
return request_body
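# Minimal usage sketch; the service name and payload keys mirror the pipelines blueprint but are
# illustrative here, not a fixed API:
#   body = ServiceManager.send_service_request(
#       "service_opennlp",
#       {"type": "fit", "classifier_id": "abc123", "model_name": "auto-trained"})
#   if body is None:
#       pass  # service not registered, queue push failed, or no listener consumed the publish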
def start_listeners(self):
"""
Starts all the workers.
"""
if not self.is_running:
reg_started = self._start_registration_listener()
proc_started = self._start_processing_listeners()
self.is_running = reg_started and proc_started
def stop_listeners(self):
"""
Stops all the workers.
"""
self.r_conn.publish(self.shutdown_channel, "exit_%s" % self.shutdown_key)
self.aio_loop.call_soon_threadsafe(self._stop_channel_watchdog)
for t_id, thread in self.workers.items():
if isinstance(thread, Thread):
thread.join()
self.is_running = False
def _start_registration_listener(self):
"""
Starts the registration worker.
:rtype: bool
"""
if self.registration_worker_name in self.workers:
return self.workers[self.registration_worker_name].is_alive()
worker = Thread(target=self._registration_listener)
worker.setDaemon(True) # Make sure it exits when main program exits
worker.setName("NLP Redis Registration Listener")
worker.start() # Start the thread
self.workers[self.registration_worker_name] = worker
logger.debug("registration worker started")
return worker.is_alive()
def _start_processing_listeners(self):
"""
Starts the processing workers.
:rtype: bool
"""
if self.processing_worker_name in self.workers:
return self.workers[self.processing_worker_name].is_alive()
worker = Thread(target=self._processing_listener)
aio_worker = Thread(target=self._start_channel_watchdog)
worker.setDaemon(True) # Make sure it exits when main program exits
aio_worker.setDaemon(True)
worker.setName("NLP Redis Processing Listener")
aio_worker.setName("NLP Redis Channel Listener")
worker.start() # Start the thread
aio_worker.start()
self.workers[self.processing_worker_name] = worker
self.workers[self.channel_worker_name] = aio_worker
return worker.is_alive() and aio_worker.is_alive()
def _start_channel_watchdog(self):
"""
Starts the channel watchdog in an asyncio-only thread and blocks until the watchdog coroutine is cancelled. It monitors the channel TTLs.
"""
asyncio.set_event_loop(self.aio_loop)
try:
self.aio_loop.run_until_complete(self._channel_watchdog())
except asyncio.CancelledError:
pass
def _stop_channel_watchdog(self):
"""
Stops the channel watchdog by cancelling its pending asyncio tasks.
"""
asyncio.set_event_loop(self.aio_loop)
tasks = list(asyncio.Task.all_tasks(self.aio_loop)) # type: ignore
tasks_to_cancel = {t for t in tasks if not t.done()}
for task in tasks_to_cancel: # type: asyncio.Task
task.cancel()
def _registration_listener(self):
"""
Registration Listener Implementation.
"""
logger.debug("Starting Registration Listener")
pubsub = self.r_conn.pubsub(ignore_subscribe_messages=True)
pubsub.subscribe(self.registration_channel, self.shutdown_channel)
self.pubsubs[self.registration_worker_name] = pubsub
for msg in pubsub.listen():
valid_shutdown_channel = "channel" in msg and "data" in msg and msg["channel"] == self.shutdown_channel
valid_register_channel = "channel" in msg and "data" in msg and msg["channel"] == self.registration_channel
# Skip message if invalid channel (shouldn't happen)
if not (valid_register_channel or valid_shutdown_channel):
continue
# Exit if shutdown message is received... (with private key)
if valid_shutdown_channel:
if msg["data"] == "exit_%s" % self.shutdown_key:
logger.debug("Exiting Registration Listener")
return
else:
continue
# Parse Message...
try:
registration_body = json.loads(msg["data"])
except (json.JSONDecodeError, TypeError) as e:
logger.error("Invalid Registration Message Format")
continue
# Extract Message Details
name = pydash.get(registration_body, "name", None)
version = pydash.get(registration_body, "version", None)
channel = pydash.get(registration_body, "channel", None)
framework = pydash.get(registration_body, "service.framework", None)
framework_types = pydash.get(registration_body, "service.types", None)
# Validate Message Types
var_list = (name, version, channel, framework, framework_types,)
type_list = (str, str, str, str, list,)
type_check = (type(val) == type_list[idx] for idx, val in enumerate(var_list))
if not all(type_check):
logger.warning("Invalid Registration Body Format %s %s %s %s %s", name, version, channel, framework, framework_types)
continue
types_validation = (isinstance(val, str) for val in framework_types)
if not all(types_validation):
logger.warning("Invalid Registration Body Types")
continue
# Verify that the channel in the registration data is not reserved... (to prevent hijacking of a channel)
if channel in self.reserved_channels:
logger.warning("Channel Name %s is reserved.", channel)
continue
# redis key to track registration information
redis_reg_data = dict(name=name, version=version, channel=channel, framework=framework, framework_types=framework_types)
redis_reg_data_str = json.dumps(redis_reg_data, separators=(",", ":"))
redis_reg_key = self.redis_reg_key_prefix + name # e.g. prefix:registration:codex:name
redis_reg_key_ttl = self.redis_reg_key_ttl
# redis key to track all registered channels
redis_channels_key = self.redis_channels_key
redis_channels_key_ttl = self.redis_channels_key_ttl
# redis key to track channel expiration from list of registered channels
redis_channel_ttl_track_key = self.redis_channel_ttl_key_prefix + channel
try:
# Schema after this
# "<prefix>:registration:channels" --> SET of all channels
# "<prefix>:registration:channel-ttl:<channel_name>" <-- holds the dates the channels were added (to be able to remove them if expired)
# "<prefix>:registration:codex:<service_name>" <-- holds registration info
with self.r_conn.pipeline() as pipe:
# nothing wrong with setting the same key more than once, it's the TTL that matters...
pipe.setex(redis_reg_key, redis_reg_key_ttl, redis_reg_data_str)
pipe.sadd(redis_channels_key, channel)
pipe.expire(redis_channels_key, redis_channels_key_ttl)
pipe.setex(redis_channel_ttl_track_key, redis_reg_key_ttl, int(time.time()))
pipe.execute()
self.registered_channels.add(channel)
except redis.RedisError as e:
logger.warning("Unable to register channel.")
continue
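# A service registers itself by publishing a JSON message like the following on the "registration"
# channel (sketch based on the fields parsed above and on SERVICE_LIST in the config; values are examples):
#   {"name": "service_opennlp", "version": "1.0", "channel": "service_opennlp",
#    "service": {"framework": "opennlp", "types": ["fit", "predict"]}}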
async def _channel_watchdog(self):
"""
Channel Watchdog Implementation.
In asyncio, it monitors the channel-ttl keys and the channels SET to expire registered services as needed.
The other functionality is to register new channels as they're added to the pubsub in the processing listener.
"""
redis_aio_pool = None
try:
logger.debug("Starting Channel Watchdog")
redis_aio_pool = await aioredis.create_redis_pool(address=(config.REDIS_HOST, config.REDIS_PORT,),
db=config.REDIS_DBNUM,
encoding="UTF-8",
loop=self.aio_loop)
redis_reg_key_ttl_timedelta = timedelta(seconds=self.redis_reg_key_ttl)
while True:
# get list of channels
channels = await redis_aio_pool.smembers(self.redis_channels_key)
# for each channel, verify when it was added
for channel in channels:
channel_ttl = await redis_aio_pool.get(self.redis_channel_ttl_key_prefix + channel)
if not isinstance(channel_ttl, str):
continue
# parse dates
try:
channel_ttl_int = pydash.parse_int(channel_ttl, 10)
channel_ttl_date = datetime.utcfromtimestamp(channel_ttl_int)
except (ValueError, TypeError):
continue
# if expired, remove!
if channel_ttl_date + redis_reg_key_ttl_timedelta < datetime.utcnow():
self.registered_channels.discard(channel)  # discard() so an unknown channel does not raise and kill the watchdog
await redis_aio_pool.srem(self.redis_channels_key, channel)
# verify list of channels again
channels = await redis_aio_pool.smembers(self.redis_channels_key)
# add any remaining channels to the channels-to-be-monitored
self.registered_channels.update(channels)
has_valid_pubsub = self.processing_worker_name in self.pubsubs and isinstance(self.pubsubs[self.processing_worker_name], redis.client.PubSub)
if has_valid_pubsub and len(channels) > 0:
# this is fine because of Python's GIL...
self.pubsubs[self.processing_worker_name].subscribe(*channels)
await asyncio.sleep(1) # being nice
except asyncio.CancelledError as e:
logger.debug("Terminating Channel Watchdog")
raise # re-raise is needed to safely terminate
except aioredis.errors.RedisError:
logger.error("Redis Error on Channel Watchdog")
finally:
if redis_aio_pool:
redis_aio_pool.close()
await redis_aio_pool.wait_closed()
@staticmethod
def _thread_killer(thread_id):
# Inspired by http://tomerfiliba.com/recipes/Thread2/
# Only works on the CPython implementation because it uses the CPython C API (PyThreadState_SetAsyncExc) directly...
# Will also not work if the thread never re-acquires the GIL (e.g. it is stuck in a system call like open)
if platform.python_implementation() != "CPython":
logger.critical("Unable to kill thread due to platform implementation. Memory Leak may occur...")
return False
import ctypes
thread_id = ctypes.c_long(thread_id)
exception = ctypes.py_object(SystemExit)
set_count = ctypes.pythonapi.PyThreadState_SetAsyncExc(thread_id, exception)
if set_count == 0:
logger.critical("Failed to set exception in thread %s. Invalid Thread ID. Memory Leak may occur...", thread_id.value)
return False
elif set_count > 1:
logger.critical("Exception was set in multiple threads with id %s. Trying to undo. Memory Leak may occur...", thread_id.value)
ctypes.pythonapi.PyThreadState_SetAsyncExc(thread_id, ctypes.c_long(0))
return False
return True
def _processing_listener_handler_wrapper(self, job_id, job_type, job_queue, job_data):
with self.r_conn.lock(self.redis_work_mutex_key_prefix + job_id, timeout=self.redis_work_mutex_key_ttl):
has_handler_lock = self.r_conn.exists(self.redis_handler_mutex_key_prefix + job_id)
if not has_handler_lock:
self.r_conn.setex(self.redis_handler_mutex_key_prefix + job_id, self.redis_handler_mutex_key_ttl, job_id)
else:
return
service_name = job_queue.rsplit(":", 1)[-1]  # "<prefix>:registration:work-queue:<name>" --> ["<prefix>:registration:work-queue", "<name>"]
job_info = self.get_registered_service_details(service_name) # job registration info
if not job_info:
self.r_conn.delete(self.redis_handler_mutex_key_prefix + job_id)
return
logger.debug("Received new job to process with id %s by process %s", job_id, os.getpid())
with ThreadPool(max_workers=1) as executor:
future = executor.schedule(self.handlers["default"], args=(job_id, job_type, job_info, job_data))
try:
future.result(timeout=self.handler_timeout)
logger.debug("Job with id %s finished by process %s", job_id, os.getpid())
except concurrent.futures.TimeoutError:
logger.warning("Timeout occurred in Job with id %s by process %s", job_id, os.getpid())
for t in executor._pool_manager.workers: # type: Thread
thread_id = t.ident
if t.is_alive():
logger.warning("Attempting to kill thread with id %s timeout occurred in process %s", thread_id, os.getpid())
is_killed = self._thread_killer(thread_id)
if is_killed:
logger.debug("Successfully killed thread with id %s in process %s", thread_id, os.getpid())
finally:
self.r_conn.delete(self.redis_handler_mutex_key_prefix + job_id)
def _processing_listener(self):
"""
Processing Listener Implementation.
Runs a handler when a processing message gets sent over an already registered channel.
"""
logger.debug("Starting Processing Listener")
pubsub = self.r_conn.pubsub(ignore_subscribe_messages=True)
pubsub.subscribe(self.shutdown_channel)
self.pubsubs[self.processing_worker_name] = pubsub
for msg in pubsub.listen():
valid_shutdown_channel = "channel" in msg and "data" in msg and msg["channel"] == self.shutdown_channel
valid_registered_channel = "channel" in msg and "data" in msg and msg["channel"] in self.registered_channels
# this may happen since we're registering channels dynamically (in the watchdog)
if not (valid_registered_channel or valid_shutdown_channel):
continue
# verify if this thread should exit
if valid_shutdown_channel:
if msg["data"] == "exit_%s" % self.shutdown_key:
logger.debug("Exiting Processing Listener")
return
else:
continue
# parse message
try:
channel_body = json.loads(msg["data"])
except (json.JSONDecodeError, TypeError) as e:
logger.error("Invalid Body Message Format")
continue
# get specific details about the message
# it needs to have at least a job_id and a job_type (request, response)
# TODO: create classes for these fields...
job_id = pydash.get(channel_body, "job_id", None)
job_type = pydash.get(channel_body, "job_type", None)
job_queue = pydash.get(channel_body, "job_queue", None)
job_data = pydash.get(channel_body, "job_data", None)
# this side only handles "response" types; the services handle the "request" type
is_valid_msg = isinstance(job_id, str) and job_type == "response" and isinstance(job_queue, str)
if not is_valid_msg:
continue
# Process Job
self._processing_listener_handler_wrapper(job_id, job_type, job_queue, job_data)
if __name__ == '__main__':
sm = ServiceManager()
sm.start_listeners()
try:
while True:
time.sleep(1)
except KeyboardInterrupt:
sm.stop_listeners()
pass

105
backend/pine/backend/log.py Normal file
View File

@@ -0,0 +1,105 @@
import enum
import json
import logging.config
import os
# make sure this package has been installed
import pythonjsonlogger
CONFIG_FILE_ENV = "PINE_LOGGING_CONFIG_FILE"
ACCESS_LOGGER_NAME = "pine.access"
ACCESS_LOGGER = None
class Action(enum.Enum):
LOGIN = enum.auto()
LOGOUT = enum.auto()
CREATE_COLLECTION = enum.auto()
VIEW_DOCUMENT = enum.auto()
ADD_DOCUMENT = enum.auto()
ANNOTATE_DOCUMENT = enum.auto()
def setup_logging():
if CONFIG_FILE_ENV not in os.environ:
return
file = os.environ[CONFIG_FILE_ENV]
if os.path.isfile(file):
with open(file, "r") as f:
logging.config.dictConfig(json.load(f))
logging.getLogger(__name__).info("Set logging configuration from file {}".format(file))
def get_flask_request_info():
from flask import request
info = {
"ip": request.remote_addr,
"path": request.full_path
}
ua = request.headers.get("User-Agent", None)
if ua: info["user-agent"] = ua
return info
def get_flask_logged_in_user():
from .auth import bp
user = bp.get_logged_in_user()
return {
"id": user["id"],
"username": user["username"]
}
###############
def access_flask_login():
access(Action.LOGIN, get_flask_logged_in_user(), get_flask_request_info(), None)
def access_flask_logout(user):
access(Action.LOGOUT, {"id": user["id"], "username": user["username"]}, get_flask_request_info(), None)
def access_flask_add_collection(collection):
extra_info = {
"collection_id": collection["_id"]
}
if "metadata" in collection:
extra_info["collection_metadata"] = collection["metadata"]
for k in list(extra_info["collection_metadata"].keys()):
if not extra_info["collection_metadata"][k]:
del extra_info["collection_metadata"][k]
access(Action.CREATE_COLLECTION, get_flask_logged_in_user(), get_flask_request_info(), None, **extra_info)
def access_flask_view_document(document):
extra_info = {
"document_id": document["_id"]
}
if "metadata" in document:
extra_info["document_metadata"] = document["metadata"]
access(Action.VIEW_DOCUMENT, get_flask_logged_in_user(), get_flask_request_info(), None, **extra_info)
def access_flask_add_document(document):
extra_info = {
"document_id": document["_id"]
}
if "metadata" in document:
extra_info["document_metadata"] = document["metadata"]
access(Action.ADD_DOCUMENT, get_flask_logged_in_user(), get_flask_request_info(), None, **extra_info)
def access_flask_annotate_document(document, annotation):
extra_info = {
"document_id": document["_id"],
"annotation_id": annotation["_id"]
}
if "metadata" in document:
extra_info["document_metadata"] = document["metadata"]
access(Action.ANNOTATE_DOCUMENT, get_flask_logged_in_user(), get_flask_request_info(), None, **extra_info)
###############
def access(action, user, request_info, message, **extra_info):
global ACCESS_LOGGER
if not ACCESS_LOGGER:
ACCESS_LOGGER = logging.getLogger(ACCESS_LOGGER_NAME)
m = {
"user": user,
"action": action.name,
"request": request_info
}
if message: m["message"] = message
ACCESS_LOGGER.critical(m, extra=extra_info)
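# Illustrative call (user, IP, and path values are made up):
#   access(Action.LOGIN, {"id": "abc123", "username": "jdoe"},
#          {"ip": "127.0.0.1", "path": "/auth/login"}, None)
# emits a record on the "pine.access" logger containing the user, action name, and request info.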

View File

@@ -0,0 +1,81 @@
# (C) 2019 The Johns Hopkins University Applied Physics Laboratory LLC.
import abc
import enum
# see models/login.ts in frontend
class LoginFormFieldType(enum.Enum):
TEXT = "text"
PASSWORD = "password"
class LoginFormField(object):
def __init__(self, name: str, display: str, field_type: LoginFormFieldType):
self.name = name
self.display = display
self.type = field_type
def to_dict(self):
return {
"name": self.name,
"display": self.display,
"type": self.type.value
}
class LoginForm(object):
def __init__(self, fields: list, button_text: str):
self.fields = fields
self.button_text = button_text
def to_dict(self):
return {
"fields": [field.to_dict() for field in self.fields],
"button_text": self.button_text
}
# see models/user.ts in frontend
class AuthUser(object):
def __init__(self):
pass
@property
@abc.abstractmethod
def id(self):
pass
@property
@abc.abstractmethod
def username(self):
pass
@property
@abc.abstractmethod
def display_name(self):
pass
@property
@abc.abstractmethod
def is_admin(self):
pass
def to_dict(self):
return {
"id": self.id,
"username": self.username,
"display_name": self.display_name,
"is_admin": self.is_admin
}
class UserDetails(object):
def __init__(self, first_name, last_name, description):
self.first_name = first_name
self.last_name = last_name
self.description = description
def to_dict(self):
return self.__dict__
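# Illustrative usage of the login form models (field names are examples, not a fixed schema):
#   form = LoginForm([LoginFormField("username", "User Name", LoginFormFieldType.TEXT),
#                     LoginFormField("password", "Password", LoginFormFieldType.PASSWORD)], "Login")
#   form.to_dict()
#   # -> {"fields": [{"name": "username", "display": "User Name", "type": "text"},
#   #                {"name": "password", "display": "Password", "type": "password"}],
#   #     "button_text": "Login"}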

File diff suppressed because one or more lines are too long

View File

@@ -0,0 +1,3 @@
# (C) 2019 The Johns Hopkins University Applied Physics Laboratory LLC.
from ..data import service

View File

@@ -0,0 +1,57 @@
# (C) 2019 The Johns Hopkins University Applied Physics Laboratory LLC.
import json
import logging
from flask import abort, Blueprint, jsonify, request
import requests
from werkzeug import exceptions
from .. import auth
from ..data import bp as data
from ..data import service
from ..pineiaa.bratiaa import iaa_service
logger = logging.getLogger(__name__)
bp = Blueprint("iaa_reports", __name__, url_prefix = "/iaa_reports")
def get_current_report(collection_id):
where = {
"collection_id": collection_id,
}
reports = service.get_items("/iaa_reports", service.where_params(where))
if len(reports) > 0:
return reports[0]
else:
return None
@bp.route("/by_collection_id/<collection_id>", methods = ["GET"])
@auth.login_required
def get_iaa_report_by_collection_id(collection_id):
where = {
"collection_id": collection_id
}
resp = service.get("iaa_reports", params=service.where_params(where))
if not resp.ok:
abort(resp.status_code)
return service.convert_response(resp)
@bp.route("/by_collection_id/<collection_id>", methods=["POST"])
@auth.login_required
def create_iaa_report_by_collection_id(collection_id):
new_report = iaa_service.getIAAReportForCollection(collection_id)
if new_report:
current_report = get_current_report(collection_id)
if current_report is not None:
headers = {"If-Match": current_report["_etag"]}
return jsonify(service.patch(["iaa_reports", current_report["_id"]], json = new_report, headers = headers).ok)
else:
return jsonify(service.post("iaa_reports", json = new_report).ok)
else:
return jsonify("ok")
def init_app(app):
app.register_blueprint(bp)

View File

@@ -0,0 +1,6 @@
# (C) 2019 The Johns Hopkins University Applied Physics Laboratory LLC.
from ..bratiaa.agree import compute_f1_agreement, iaa_report, AnnFile, F1Agreement, Document, input_generator
from ..bratiaa.evaluation import exact_match_instance_evaluation, exact_match_token_evaluation, Annotation
from ..bratiaa.utils import tokenize
from .. import service

View File

@@ -0,0 +1,276 @@
# (C) 2019 The Johns Hopkins University Applied Physics Laboratory LLC.
import logging
from itertools import combinations
import matplotlib.pyplot as plt
import numpy as np
from functools import partial
from scipy.special import comb
from tabulate import tabulate
from ..bratiaa.evaluation import *
from ..bratiaa.utils import TokenOverlap
from collections import namedtuple, Counter
Annotation = namedtuple('Annotation', ['type', 'label', 'offsets'])
AnnFile = namedtuple('AnnFile', ['annotator_id', 'annotations'])
# attempted division by zero is expected and unproblematic -> NaN
np.seterr(divide='ignore', invalid='ignore')
LOGGER = logging.getLogger(__name__)
class Document:
__slots__ = ['ann_files', 'txt', 'doc_id']
def __init__(self, txt, doc_id):
self.txt = txt
self.doc_id = doc_id
self.ann_files = []
def input_generator(json_list):
LOGGER.debug(json_list)
annotators = set()
documents = []
# expects keys "text", "_id", and "annotations"
# annotations: {annotator: [[start, stop, label] or [label]]}
for id, ann_doc in enumerate(json_list):
anns_people = []
#Annotation = namedtuple('Annotation', ['type', 'label', 'offsets'])
doc = Document(ann_doc['text'], ann_doc['_id'])
for ann_name, ann_list in ann_doc['annotations'].items():
anns = []
for ann in ann_list:
if len(ann)!=3:
continue
#Evaluation requires a tuple of a tuple of start,stop
anns.append(Annotation('T', ann[2], tuple([(ann[0], ann[1])])))
doc.ann_files.append(AnnFile(ann_name, anns))
annotators.add(ann_name)
documents.append(doc)
return sorted(list(annotators)), documents
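# Illustrative input element (document text, IDs, and labels are made up):
#   {"_id": "doc1", "text": "Barack Obama visited Paris.",
#    "annotations": {"alice": [[0, 12, "PERSON"], [21, 26, "GPE"]],
#                    "bob":   [[0, 12, "PERSON"]]}}
# which yields annotators == ["alice", "bob"] and one Document with two AnnFile entries.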
def compute_f1(tp, fp, fn):
return (2 * tp) / (2 * tp + fp + fn)
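# Worked example: tp=8, fp=2, fn=4 -> F1 = (2 * 8) / (2 * 8 + 2 + 4) = 16 / 22 ≈ 0.727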
class F1Agreement:
def __init__(self, annotators, documents, labels, eval_func=exact_match_instance_evaluation, token_func=None):
assert len(annotators) > 1, 'At least two annotators are necessary to compute agreement!'
num_pairs = comb(len(annotators), 2, exact=True)
# (p, d, c, l) where p := annotator pairs, d := documents, c := counts (tp, fp, fn), l := labels
self._pdcl = np.zeros((num_pairs, len(documents), 3, len(labels)))
self._documents = list(documents)
self._doc2idx = {d: i for i, d in enumerate(documents)}
self._labels = list(labels)
self._label2idx = {l: i for i, l in enumerate(labels)}
self._annotators = list(annotators)
self._pairs = [pair for pair in combinations(annotators, 2)]
self._pair2idx = {p: i for i, p in enumerate(self._pairs)}
# add pairs in reverse order (same index)
for (a1, a2), value in self._pair2idx.copy().items():
self._pair2idx[(a2, a1)] = value
self._eval_func = eval_func # function used to extract true positives, false positives and false negatives
self._token_func = token_func # function used for tokenization
self._compute_tp_fp_fn(documents)
@property
def annotators(self):
return list(self._annotators)
@property
def documents(self):
return list(self._documents)
@property
def labels(self):
return list(self._labels)
def _compute_tp_fp_fn(self, documents):
for doc_index, document in enumerate(documents):
assert doc_index < len(self._documents), 'Input generator yields more documents than expected!'
to = None
if self._token_func:
text = document.txt
tokens = list(self._token_func(text))
to = TokenOverlap(text, tokens)
for anno_file_1, anno_file_2 in combinations(document.ann_files, 2):
tp, fp, fn = self._eval_func(anno_file_1.annotations, anno_file_2.annotations, tokens=to)
pair_idx = self._pair2idx[(anno_file_1.annotator_id, anno_file_2.annotator_id)]
doc_idx = self._doc2idx[document]
self._increment_counts(tp, pair_idx, doc_idx, 0)
self._increment_counts(fp, pair_idx, doc_idx, 1)
self._increment_counts(fn, pair_idx, doc_idx, 2)
def _increment_counts(self, annotations, pair, doc, kind):
for a in annotations:
try:
self._pdcl[pair][doc][kind][self._label2idx[a.label]] += 1
except KeyError:
logging.error(
f'Encountered unknown label "{a.label}"! Please make sure that your "annotation.conf" '
f'(https://brat.nlplab.org/configuration.html#annotation-configuration) '
f'is located under the project root and contains an exhaustive list of entities!'
)
raise
def mean_sd_per_label(self):
"""
Mean and standard deviation of all annotator combinations' F1 scores by label.
"""
pcl = np.sum(self._pdcl, axis=1) # sum over documents
f1_pairs = compute_f1(pcl[:, 0], pcl[:, 1], pcl[:, 2])
avg, stddev = self._mean_sd(f1_pairs)
return avg, stddev
def mean_sd_per_document(self):
"""
Mean and standard deviation of all annotator combinations' F1 scores per document.
"""
pdc = np.sum(self._pdcl, axis=3) # sum over labels
f1_pairs = compute_f1(pdc[:, :, 0], pdc[:, :, 1], pdc[:, :, 2])
avg, stddev = self._mean_sd(f1_pairs)
return avg, stddev
def mean_sd_total(self):
"""
Mean and standard deviation of all annotator combinations' F1 scores.
"""
pc = np.sum(self._pdcl, axis=(1, 3)) # sum over documents and labels
f1_pairs = compute_f1(pc[:, 0], pc[:, 1], pc[:, 2])
avg, stddev = self._mean_sd(f1_pairs)
return avg, stddev
def mean_sd_per_label_one_vs_rest(self, annotator):
"""
Mean and standard deviation of all annotator combinations' F1 scores involving given annotator per label.
"""
pcl = np.sum(self._pdcl, axis=1) # sum over documents
pcl = pcl[self._pairs_involving(annotator)]
f1_pairs = compute_f1(pcl[:, 0], pcl[:, 1], pcl[:, 2])
avg, stddev = self._mean_sd(f1_pairs)
return avg, stddev
def mean_sd_total_one_vs_rest(self, annotator):
"""
Mean and standard deviation of all annotator combinations' F1 scores involving given annotator.
"""
pc = np.sum(self._pdcl, axis=(1, 3)) # sum over documents and labels
pc = pc[self._pairs_involving(annotator)]
f1_pairs = compute_f1(pc[:, 0], pc[:, 1], pc[:, 2])
if len(f1_pairs) > 1:
avg, stddev = self._mean_sd(f1_pairs)
else:
avg, stddev = f1_pairs, 0
return avg, stddev
def _pairs_involving(self, annotator):
return [self._pair2idx[(a1, a2)] for (a1, a2) in self._pairs if
a1 == annotator or a2 == annotator]
@staticmethod
def _mean_sd(f1_pairs):
"""
Mean and standard deviation along first axis.
"""
f1 = np.mean(f1_pairs, axis=0)
f1_std = np.std(f1_pairs, axis=0)
return f1, f1_std
@staticmethod
def print_table(row_label_header, row_labels, avg, stddev, precision=3):
stats = np.stack((row_labels, avg, stddev)).transpose()
headers = [row_label_header, 'Mean F1', 'SD F1']
LOGGER.debug(tabulate(stats, headers=headers, tablefmt='github', floatfmt=f'.{precision}f'))
def compute_total_f1_matrix(self):
"""
Returns (n x n) matrix, where n is the number of annotators, containing
pair-wise total F1 scores between all annotators.
By definition, the matrix is symmetric and F1 = 1 on the main diagonal.
"""
pc = np.sum(self._pdcl, axis=(1, 3)) # sum over documents and labels
f1_pairs = compute_f1(pc[:, 0], pc[:, 1], pc[:, 2])
num_annotators = len(self._annotators)
f1_matrix = np.zeros((num_annotators, num_annotators))
annotator2idx = {a: i for i, a in enumerate(self._annotators)}
for ann1, ann2 in self._pairs:
ann1_idx, ann2_idx = annotator2idx[ann1], annotator2idx[ann2]
f1_matrix[ann1_idx][ann2_idx] = f1_pairs[self._pair2idx[(ann1, ann2)]]
f1_matrix[ann2_idx][ann1_idx] = f1_pairs[self._pair2idx[(ann2, ann1)]]
# perfect diagonal by definition
for i in range(len(self._annotators)):
f1_matrix[i][i] = 1
return f1_matrix
def draw_heatmap(self, out_path):
"""
Draws heatmap based on square matrix of F1 scores.
"""
matrix = self.compute_total_f1_matrix()
fig, ax = plt.subplots()
im = ax.imshow(matrix)
# ticks and labels
num_annotators = len(self._annotators)
ax.set_xticks(np.arange(num_annotators))
ax.set_yticks(np.arange(num_annotators))
# show complete grid
ax.set_xlim(-0.5, num_annotators - 0.5)
ax.set_ylim(num_annotators - 0.5, -0.5)
ax.set_xticklabels(self._annotators)
ax.set_yticklabels(self._annotators)
plt.setp(ax.get_xticklabels(), rotation=45, ha="right",
rotation_mode="anchor")
for i in range(len(self._annotators)):
for j in range(len(self._annotators)):
ax.text(j, i, f'{matrix[i, j]:.2f}', ha="center", va="center", color="w")
# color bar
cbar = ax.figure.colorbar(im, ax=ax)
cbar.ax.set_ylabel('F1 score', rotation=-90, va="bottom")
fig.tight_layout()
plt.savefig(out_path)
def compute_f1_agreement(annotators, documents, labels, token_func=None, eval_func=None):
if not eval_func:
eval_func = exact_match_instance_evaluation
if token_func:
eval_func = exact_match_token_evaluation
#input_gen = partial(input_gen, project_root)
#annotators, documents = _collect_annotators_and_documents(input_gen)
return F1Agreement(annotators, documents, sorted(labels), eval_func=eval_func, token_func=token_func)
def iaa_report(f1_agreement, precision=3):
agreement_type = '* Instance-based F1 agreement'
if f1_agreement._token_func:
agreement_type = '* Token-based F1 agreement'
LOGGER.debug(f'# Inter-Annotator Agreement Report\n')
LOGGER.debug(agreement_type)
LOGGER.debug('\n## Project Setup\n')
LOGGER.debug(f'* {len(f1_agreement.annotators)} annotators: {", ".join(f1_agreement.annotators)}')
LOGGER.debug(f'* {len(f1_agreement.documents)} agreement documents')
LOGGER.debug(f'* {len(f1_agreement.labels)} labels')
docids = [d.doc_id for d in f1_agreement.documents]
LOGGER.debug('\n## Agreement per Document\n')
#f1_agreement.print_table('Document', f1_agreement.documents, *f1_agreement.mean_sd_per_document(), precision=precision)
f1_agreement.print_table('Document', docids, *f1_agreement.mean_sd_per_document(), precision=precision)
LOGGER.debug('\n## Agreement per Label\n')
f1_agreement.print_table('Label', f1_agreement.labels, *f1_agreement.mean_sd_per_label(), precision=precision)
LOGGER.debug('\n## Overall Agreement\n')
avg, stddev = f1_agreement.mean_sd_total()
LOGGER.debug(f'* Mean F1: {avg:.{precision}f}, SD F1: {stddev:.{precision}f}\n')

View File

@@ -0,0 +1,50 @@
# (C) 2019 The Johns Hopkins University Applied Physics Laboratory LLC.
import logging
import argparse
from ..bratiaa.agree import iaa_report, compute_f1_agreement
from ..bratiaa.utils import tokenize
def parse_args():
parser = argparse.ArgumentParser()
parser.add_argument('project_root',
help='Root directory of the Brat annotation project')
parser.add_argument('--heatmap',
help='Output path for F1-agreement heatmap',
dest='heatmap_path')
parser.add_argument('-p', '--precision',
help='Precision of results (number of digits following the decimal point)',
dest='precision',
default=3)
parser.add_argument('-s', '--silent',
help='Set log level on ERROR',
action='store_true')
parser.add_argument('-t', '--tokenize',
help='Token-based evaluation (tokenizer splits on whitespace)',
action='store_true')
return parser.parse_args()
def main():
args = parse_args()
log_level = logging.WARNING
if args.silent:
log_level = logging.ERROR
logging.basicConfig(level=log_level,
format='%(asctime)s - %(levelname)s - %(message)s')
token_func = None
if args.tokenize:
token_func = tokenize
f1_agreement = compute_f1_agreement(args.project_root, token_func=token_func)
iaa_report(f1_agreement, args.precision)
if args.heatmap_path:
f1_agreement.draw_heatmap(args.heatmap_path)
if __name__ == '__main__':
main()

View File

@@ -0,0 +1,48 @@
# (C) 2019 The Johns Hopkins University Applied Physics Laboratory LLC.
"""
Functions for computing the difference between two sets of annotations.
"""
from collections import namedtuple, Counter
Annotation = namedtuple('Annotation', ['type', 'label', 'offsets'])
def exact_match_instance_evaluation(ann_list_1, ann_list_2, tokens=None):
exp = set(ann_list_1)
pred = set(ann_list_2)
tp = exp.intersection(pred)
fp = pred.difference(exp)
fn = exp.difference(pred)
return tp, fp, fn
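# Illustrative example (offsets and labels are made up):
#   a1 = [Annotation('T', 'PERSON', ((0, 12),))]
#   a2 = [Annotation('T', 'PERSON', ((0, 12),)), Annotation('T', 'GPE', ((21, 26),))]
#   exact_match_instance_evaluation(a1, a2)
#   # tp == {the PERSON annotation}, fp == {the GPE annotation}, fn == set()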
def exact_match_token_evaluation(ann_list_1, ann_list_2, tokens=None):
"""
Annotations are split into token-sized bits before true positives, false positives and false negatives are computed.
Sub-token annotations are expanded to full tokens. Long annotations will influence the results more than short
annotations. Boundary errors for adjacent annotations with the same label are ignored!
"""
exp = Counter(_read_token_annotations(ann_list_1, tokens))
pred = Counter(_read_token_annotations(ann_list_2, tokens))
tp = counter2list(exp & pred)
fp = counter2list(pred - exp)
fn = counter2list(exp - pred)
return tp, fp, fn
def counter2list(c):
for elem, cnt in c.items():
for i in range(cnt):
yield (elem)
def _read_token_annotations(ann_list, tokens):
"""
Yields a new annotation for each token overlapping with an annotation. If annotations are overlapping each other,
there will be multiple annotations for a single token.
"""
for annotation in set(ann_list):
for start, end in annotation.offsets:
for ts, te in tokens.overlapping_tokens(start, end):
yield Annotation(annotation.type, annotation.label, ((ts, te),))

View File

@@ -0,0 +1,153 @@
# (C) 2019 The Johns Hopkins University Applied Physics Laboratory LLC.
import requests
from ..bratiaa import iaa_report, compute_f1_agreement, input_generator
from ..bratiaa import tokenize
from ..bratiaa import exact_match_token_evaluation
from collections import defaultdict
from .. import service
import numpy as np
EVE_HEADERS = {'Content-Type': 'application/json'}
def get_items(resource):
response = service.get(resource, headers=EVE_HEADERS)
if response.status_code == 200:
r = response.json()
if '_items' in r:
if '_links' in r and 'next' in r['_links']:
return r['_items'], r['_links']['next']['href']
else:
return r['_items'], None
return [], None
def get_all_items(query):
total_items = []
while True:
items, query = get_items(query)
total_items.extend(items)
if query is None:
break
return total_items
def get_doc_annotations(collection_id, exclude=None):
#get documents
resource = 'documents?where={"collection_id":"%s"}' % (collection_id)
docs = get_all_items(resource)
#get annotations
resource = 'annotations?where={"collection_id":"%s"}' % (collection_id)
annotations = get_all_items(resource)
combined = {}
for d in docs:
combined[d['_id']]={"_id":d['_id'], "text":d['text'], "annotations":{}}
if "metadata" in d:
combined[d['_id']]['metadata'] = d['metadata']
for a in annotations:
creator = a['creator_id']
if exclude and creator in exclude :
continue
docid = a['document_id']
anns = a['annotation']
if docid in combined:
combined[docid]["annotations"][creator]=anns
return combined
def fix_num_for_json(number):
if np.isnan(number):
return "null"
else:
return number
def getIAAReportForCollection(collection_id):
combined = get_doc_annotations(collection_id) ## exclude=set(['bchee1'])
labels = set()
for v in combined.values():
for ann_list in v['annotations'].values():
for ann in ann_list:
if len(ann) == 3:
labels.add(ann[2])
anns = []
for k, c in combined.items():
anns.append(c)
token_func = tokenize
eval_func = exact_match_token_evaluation
labels = list(labels)
annotators, documents = input_generator(anns)
try:
f1_agreement = compute_f1_agreement(annotators, documents, labels, eval_func=eval_func, token_func=token_func)
# Get label counts by provider
counts = defaultdict(lambda: defaultdict(int))
for document in anns:
for per, ann_list in document['annotations'].items():
for a in ann_list:
if len(a) == 3:
counts[per][a[2]] += 1
# for label in labels:
# print(label)
# for person, label_counts in counts.items():
# if label in label_counts:
# print('\t', person, label_counts[label])
docids = [d.doc_id for d in f1_agreement.documents]
list_per_doc = []
mean_sd_per_doc = f1_agreement.mean_sd_per_document()
for index, docID in enumerate(docids):
list_per_doc.append({"doc_id": docID, "avg": fix_num_for_json(mean_sd_per_doc[0][index]),
"stddev": fix_num_for_json(mean_sd_per_doc[1][index])})
list_per_label = []
mean_sd_per_label = f1_agreement.mean_sd_per_label()
for index, label in enumerate(f1_agreement.labels):
list_per_label.append({"label": label, "avg": fix_num_for_json(mean_sd_per_label[0][index]),
"stddev": fix_num_for_json(mean_sd_per_label[1][index])})
labels_per_annotator_dict = {}
for item in dict(counts).items():
labels_per_annotator_dict[item[0]] = dict(item[1])
# save iaa report
new_iaa_report = {
"collection_id": collection_id,
"num_of_annotators": len(f1_agreement.annotators),
"num_of_agreement_docs": len(f1_agreement.documents),
"num_of_labels": len(f1_agreement.labels),
"per_doc_agreement": list_per_doc,
"per_label_agreement": list_per_label,
"overall_agreement": {"mean": f1_agreement.mean_sd_total()[0], "sd": f1_agreement.mean_sd_total()[1],
"heatmap_data": {"matrix": list(
map(lambda x: list(x), list(f1_agreement.compute_total_f1_matrix()))),
"annotators": list(f1_agreement.annotators)}},
"labels_per_annotator": labels_per_annotator_dict
}
return new_iaa_report
except AssertionError:
# There are no annotations from different annotators to compute IAA for
return None

View File

@@ -0,0 +1,55 @@
# (C) 2019 The Johns Hopkins University Applied Physics Laboratory LLC.
import re
import numpy as np
ENCODING = 'utf-8'
TOKEN = re.compile(r'\S+')
def tokenize(text):
for match in re.finditer(TOKEN, text):
yield match.start(), match.end()
def read(path, encoding=ENCODING, newline='\r\n', mode='r'):
with open(path, newline=newline, encoding=encoding, mode=mode) as fin:
return fin.read()
class TokenOverlap:
"""
Data structure for quick lookup of tokens overlapping with given span.
Assumes that the provided list of tokens is sorted by indices!
"""
def __init__(self, text, tokens):
self.tokens = tokens
self.char2token = self.compute_mapping(len(text), tokens)
@staticmethod
def compute_mapping(text_length, tokens):
char2token = np.zeros(text_length, dtype=int)
i = 0
for token_idx, (start, end) in enumerate(tokens):
char2token[i:start] = token_idx - 1
char2token[start:end] = token_idx
i = end
char2token[i:text_length] = len(tokens) - 1
return char2token
def overlapping_tokens(self, start, end):
assert end <= len(self.char2token), f'End index {end} > text length {len(self.char2token)}!'
if end < 1 or start >= end:
return []
start_token = self.char2token[start]
if start_token == -1: # start offset before first token
start_token = 0
if self.tokens[start_token][1] <= start: # start offset between two tokens
start_token += 1
end_token = self.char2token[end - 1] # end offset is exclusive
if end_token < 0 or end_token < start_token:
return []
return self.tokens[start_token:end_token + 1]
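# Illustrative example:
#   text = "New York City"
#   tokens = list(tokenize(text))                          # [(0, 3), (4, 8), (9, 13)]
#   TokenOverlap(text, tokens).overlapping_tokens(4, 13)   # -> [(4, 8), (9, 13)]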

View File

@@ -0,0 +1 @@
# (C) 2019 The Johns Hopkins University Applied Physics Laboratory LLC.

View File

@@ -0,0 +1,257 @@
# (C) 2019 The Johns Hopkins University Applied Physics Laboratory LLC.
import json
import logging
import pydash
import random
import uuid
from flask import abort, Blueprint, jsonify, request
from werkzeug import exceptions
from .. import auth
from ..data import service
from ..collections import bp as collectionsbp
from ..job_manager.service import ServiceManager
logger = logging.getLogger(__name__)
service_manager = ServiceManager()
bp = Blueprint("pipelines", __name__, url_prefix = "/pipelines")
# Cache classifiers and overlap so we don't need to keep on making queries
# we don't need to invalidate cache because we don't invalidate classifiers
classifier_dict = {}
classifier_pipelines = {}
@bp.route("/", strict_slashes = False, methods = ["GET"])
@auth.login_required
def get_pipelines():
resp = service.get("pipelines")
if not resp.ok:
abort(resp.status_code)
return service.convert_response(resp)
@bp.route("/by_id/<pipeline_id>", methods = ["GET"])
def get_pipeline_by_id(pipeline_id):
resp = service.get("pipelines/" + pipeline_id)
return service.convert_response(resp)
def _get_collection_classifier(collection_id):
where = {
"collection_id": collection_id
}
classifiers = service.get_items("/classifiers", params=service.where_params(where))
if len(classifiers) != 1:
raise exceptions.BadRequest(description="Expected one classifier but found {}.".format(len(classifiers)))
return classifiers[0]
@bp.route("/classifiers/by_collection_id/<collection_id>", methods=["GET"])
@auth.login_required
def get_collection_classifier(collection_id):
return jsonify(_get_collection_classifier(collection_id))
def _get_classifier_metrics(classifier_id):
where = {
"classifier_id": classifier_id
}
metrics = service.get_items("/metrics", params=service.where_params(where))
logger.info(metrics)
if len(metrics) != 1:
raise exceptions.BadRequest(description="Expected one metric but found {}.".format(len(metrics)))
all_metrics = service.get_all_versions_of_item_by_id("/metrics", metrics[0]["_id"])
logger.info(all_metrics)
return all_metrics
@bp.route("/metrics", methods=["GET"])
@auth.login_required
def get_metrics():
resp = service.get("metrics")
if not resp.ok:
abort(resp.status_code)
return service.convert_response(resp)
@bp.route("/metrics/by_classifier_id/<classifier_id>", methods=["GET"])
# @auth.login_required
def get_classifier_metrics(classifier_id):
return jsonify(_get_classifier_metrics(classifier_id))
def _get_classifier(classifier_id):
classifier = service.get_item_by_id("/classifiers", classifier_id)
if classifier is None:
return False
else:
pipeline = service.get_item_by_id("/pipelines", classifier["pipeline_id"])
if pipeline is None:
return False
else:
classifier_dict[classifier_id] = classifier
classifier_pipelines[classifier_id] = pipeline["name"].lower()
return True
def _get_next_instance(classifier_id):
if classifier_id not in classifier_dict:
if not _get_classifier(classifier_id):
raise exceptions.NotFound(description = "Classifier not found: could not load classifier.")
items = service.get_items("/next_instances", service.where_params({"classifier_id": classifier_id}))
# r = requests.get(ENTRY_POINT + '/next_instances?where={"classifier_id":"' + classifier_id + '"}',
# headers=EVE_HEADERS)
# items = get_items_from_response(r)["_items"]
if len(items) == 0:
raise exceptions.NotFound("No next instances")
return items[0]
@bp.route("/next_document/by_classifier_id/<classifier_id>", methods = ["GET"])
@auth.login_required
def get_next_by_classifier(classifier_id):
instance = _get_next_instance(classifier_id)
user_id = auth.get_logged_in_user()["id"]
if user_id not in instance["overlap_document_ids"]:
print("new user: adding to overlap document ids")
instance["overlap_document_ids"][user_id] = collectionsbp.get_overlap_ids(classifier_dict[classifier_id]["collection_id"])
to_patch = {
"_id": instance["_id"],
"overlap_document_ids": instance["overlap_document_ids"]
}
headers = {"If-Match": instance["_etag"]}
r = service.patch(["next_instances", instance["_id"]], json = to_patch, headers = headers)
if not r.ok:
abort(r.status_code, r.content)
if random.random() <= classifier_dict[classifier_id]["overlap"] and len(instance["overlap_document_ids"][user_id]) > 0:
return jsonify(instance["overlap_document_ids"][user_id].pop())
elif len(instance["document_ids"]) > 0:
return jsonify(instance["document_ids"].pop())
else:
return jsonify(None)
@bp.route("/next_document/by_classifier_id/<classifier_id>/<document_id>", methods = ["POST"])
@auth.login_required
def advance_to_next_document_by_classifier(classifier_id, document_id):
user_id = auth.get_logged_in_user()["id"]
# get stored next data
if classifier_id not in classifier_dict:
if not _get_classifier(classifier_id):
raise exceptions.NotFound(description="Classifier not found: could not load classifier.")
else:
# reset classifier_dict for classifier, cached classifiers are getting out of sync
del classifier_dict[classifier_id]
if not _get_classifier(classifier_id):
raise exceptions.NotFound(description="Classifier not found: could not load classifier.")
logger.debug(classifier_dict[classifier_id])
pipeline = pydash.get(classifier_pipelines, classifier_id, None)
if pipeline is None:
return jsonify("Error, pipeline not found"), 500
instance = _get_next_instance(classifier_id)
data = None
trained = False
request_body = None
if document_id in instance["overlap_document_ids"][user_id]:
instance["overlap_document_ids"][user_id].remove(document_id)
data = {"overlap_document_ids": instance["overlap_document_ids"]}
elif document_id in instance["document_ids"]:
instance["document_ids"].remove(document_id)
data = {"document_ids": instance["document_ids"]}
else:
logger.info("Document {} not found in instance, document already annotated".format(document_id))
if data is not None:
headers = {"If-Match": instance["_etag"]}
r = service.patch(["next_instances", instance["_id"]], json = data, headers = headers)
#r = requests.patch(ENTRY_POINT + "/next_instances/" + items[0]["_id"], json.dumps(data), headers=headers)
if not r.ok:
abort(r.status_code, r.content)
classifier_dict[classifier_id]["annotated_document_count"] += 1
headers = {"If-Match": classifier_dict[classifier_id]["_etag"]}
service.remove_nonupdatable_fields(classifier_dict[classifier_id])
r = service.patch(["classifiers", classifier_id], json=classifier_dict[classifier_id], headers=headers)
if not r.ok:
abort(r.status_code, r.content)
del classifier_dict[classifier_id]
if not _get_classifier(classifier_id):
raise exceptions.NotFound(description="Classifier not found: could not load classifier.")
if classifier_dict[classifier_id]["annotated_document_count"] % classifier_dict[classifier_id]["train_every"] == 0:
## Check to see if we should update classifier
## Add to work queue to update classifier - queue is pipeline_id
## ADD TO PUBSUB {classifier_id} to reload
logger.info("training")
job_data = {'classifier_id': classifier_id, 'pipeline': pipeline, 'type': 'fit', 'framework': pipeline,
'model_name': 'auto-trained'}
request_body = service_manager.send_service_request(pipeline, job_data)
trained = True
return jsonify({"success": True, "trained": trained, "body": request_body})
@bp.route("/predict", methods=["POST"])
@auth.login_required
def predict():
try:
input_json = request.get_json()
classifier_id = input_json["classifier_id"]
documents = input_json["documents"]
docids = input_json["document_ids"]
except Exception as e:
abort(400, "Error parsing input", custom="Input JSON could not be read:" + str(e))
if classifier_id not in classifier_dict:
if not _get_classifier(classifier_id):
raise exceptions.NotFound(description = "Classifier not found: could not load classifier.")
pipeline_id = classifier_dict[classifier_id]["pipeline_id"]
##enqueue documents and ids - CHECK THIS MAY NOT WORK!
key = "{}:result:{}".format(pipeline_id, uuid.uuid4())
service_manager.send_service_request(pipeline_id, {"type": "predict", "documents": documents,
"document_ids": docids, "classifier_id": classifier_id,
"response_key": key})
# alternatively use session key but might need more work to integrate with authentication, nginx
# session["predict_result_key"] = key
return jsonify({"response_key": key})
@bp.route("/train", methods=["POST"])
def train():
input_json = request.get_json()
classifier_id = pydash.get(input_json, "classifier_id", None)
# get classifier and pipeline
if classifier_id not in classifier_dict:
if not _get_classifier(classifier_id):
return jsonify("Error, classifier not found"), 500
pipeline = pydash.get(classifier_pipelines, classifier_id, None)
if pipeline is None:
return jsonify("Error, pipeline not found"), 500
model_name = pydash.get(input_json, "model_name", None)
logger.info(service_manager.get_registered_channels())
job_data = {'classifier_id': classifier_id, 'pipeline': pipeline, 'type': 'fit', 'framework': pipeline,
'model_name': model_name}
request_body = service_manager.send_service_request(pipeline, job_data)
return jsonify({"request": request_body})
def init_app(app):
service_manager.start_listeners()
app.register_blueprint(bp)

View File

@@ -0,0 +1,11 @@
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
# **********************************************************************
# Copyright (C) 2018 Johns Hopkins University Applied Physics Laboratory
#
# All Rights Reserved.
# This material may only be used, modified, or reproduced by or for the
# U.S. government pursuant to the license rights granted under FAR
# clause 52.227-14 or DFARS clauses 252.227-7013/7014.
# For any other permission, please contact the Legal Office at JHU/APL.
# **********************************************************************

View File

@@ -0,0 +1,398 @@
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
# **********************************************************************
# Copyright (C) 2018 Johns Hopkins University Applied Physics Laboratory
#
# All Rights Reserved.
# This material may only be used, modified, or reproduced by or for the
# U.S. government pursuant to the license rights granted under FAR
# clause 52.227-14 or DFARS clauses 252.227-7013/7014.
# For any other permission, please contact the Legal Office at JHU/APL.
# **********************************************************************
import inspect
import json
import logging
import os
import pathlib
from argparse import ArgumentParser
from datetime import timedelta
from importlib.util import find_spec
import six
LOGGER = logging.getLogger(__name__)
class BaseConfig(object):
# # # # GLOBAL PARAMETERS - ONLY EXTEND - # # # #
ROOT_DIR = os.path.normpath(os.path.join(os.path.abspath(os.path.dirname(__file__)), os.pardir))
BASE_DIR = os.path.normpath(os.path.join(os.path.abspath(os.path.dirname(__file__)), os.pardir))
BASE_CFG_FILE = "config.yaml"
BASE_ENV_PREFIX = "AL_"
# # # # GLOBAL PARAMETERS - ONLY EXTEND - # # # #
# # # ENVIRONMENT PARAMS # # #
DEBUG = True
TESTING = False
LOGGER_NAME = BASE_ENV_PREFIX + "LOGGER"
LOGGER_DIR = "logs"
LOGGER_FILE = "debug.log"
LOGGER_LEVEL = logging.DEBUG
# EVE
EVE_HOST = "localhost"
EVE_PORT = 5001
# Redis
REDIS_HOST = "localhost"
REDIS_PORT = 6379 # was 6380?
REDIS_USR = None
REDIS_PWD = None
REDIS_DBNUM = 0 # Redis DB Number (0-5)
REDIS_PREFIX = "AL:" # Note: Must be used manually
REDIS_EXPIRE = 3600
# Scheduler
SCHEDULER_REGISTRATION_TIMEOUT = timedelta(minutes=10).seconds # how long something will be registered as a service
SCHEDULER_HANDLER_TIMEOUT = timedelta(minutes=10).seconds # how long it should take to process a service response
SCHEDULER_QUEUE_TIMEOUT = timedelta(
minutes=60).seconds # how long a job can sit in the queue before it expires (e.g. client did not consume)
# New Services
SERVICE_REGISTRATION_CHANNEL = "registration"
SERVICE_REGISTRATION_FREQUENCY = 60 # unit: seconds
SERVICE_LISTENING_FREQUENCY = 1 # unit: seconds
SERVICE_HANDLER_TIMEOUT = 3600 # unit: seconds
SERVICE_LIST = [
dict(
name="service_corenlp",
version="1.0",
channel="service_corenlp",
service=dict(
framework="corenlp",
types=["fit", "predict"]
)
),
dict(
name="service_opennlp",
version="1.0",
channel="service_opennlp",
service=dict(
framework="opennlp",
types=["fit", "predict"]
)
),
dict(
name="service_spacy",
version="1.0",
channel="service_spacy",
service=dict(
framework="spacy",
types=["fit", "predict"]
)
)
]
# Datasets
DATASETS_LOCAL_DIR = r"D:\Data\DEEPCATT2_DB_DATA\datasets" # Only works on production
# Models
MODELS_LOCAL_DIR = r"D:\\Data\\DEEPCATT2_DB_DATA\\Models" # Only works on production
def __init__(self, root_dir=None):
# Default are already loaded at this Point
if root_dir is not None:
self._process_paths(root_dir)
# Read in environment vars
self._process_file_cfg() # Load Config File First
self._process_env_vars() # Process Env Vars Last
if getattr(self, "TESTING", False) is True:
LOGGER.info("Environment<" + self.__class__.__name__ + ">: " + self.as_dict().__str__())
@classmethod
def _get_config_var_paths(cls, root_dict=None):
return_dict = dict()
root_obj = root_dict if root_dict is not None else cls.as_dict()
def traverse_dict(dict_obj, _path=None):
if _path is None:
_path = []
for _key, _val in six.iteritems(dict_obj):
next_path = _path + [_key]
if isinstance(_val, dict):
for _dict in traverse_dict(_val, next_path):
yield _dict
else:
yield next_path, _val
for path, val in traverse_dict(root_obj):
return_dict[".".join(path)] = val
return return_dict
@classmethod
def _process_paths(cls, alt_path=None):
norm_alt_path = os.path.normpath(os.path.abspath(alt_path))
old_root_path = cls.ROOT_DIR
if not os.path.exists(alt_path):
return
white_list = frozenset({"ROOT_DIR", "BASE_DIR"})
for key, value in six.iteritems(cls.as_dict()):
if not isinstance(value, str):
continue
if key in white_list:
continue
if value.startswith(old_root_path) and hasattr(cls, key):
fixed_path = value.replace(old_root_path, norm_alt_path)
setattr(cls, key, fixed_path)
continue
setattr(cls, "ROOT_DIR", norm_alt_path)
setattr(cls, "BASE_DIR", norm_alt_path)
@classmethod
def _process_file_cfg(cls):
try:
from ruamel.yaml import YAML
except ImportError:
LOGGER.debug("YAML Parsing not Available")
return
yaml = YAML(typ="unsafe", pure=True)
yaml_path = os.path.join(cls.BASE_DIR, cls.BASE_CFG_FILE)
yaml_pathlib = pathlib.Path(yaml_path)
if not (yaml_pathlib.exists() or yaml_pathlib.is_file()):
LOGGER.debug("YAML Config File not Available")
return
config_dict = yaml.load(yaml_pathlib) # type: dict
config_dict = config_dict if isinstance(config_dict, dict) else dict()
config_dict_cls = cls.as_dict()
white_list = ("ROOT_DIR", "BASE_DIR", "BASE_CFG_FILE")
if find_spec("pydash"):
import pydash
for path, value in six.iteritems(config_dict):
upper_key = str.upper(path)
if upper_key not in white_list and pydash.has(config_dict_cls, upper_key):
setattr(cls, upper_key, value)
else:
for key, value in six.iteritems(config_dict):
upper_key = str.upper(key)
if upper_key in config_dict_cls and upper_key not in white_list:
setattr(cls, upper_key, value)
LOGGER.info("YAML Config %s was Loaded" % yaml_path)
@classmethod
def _process_env_vars(cls):
for key, value in six.iteritems(cls.as_dict()):
is_bool = isinstance(value, bool)
is_int = isinstance(value, six.integer_types)
is_flt = isinstance(value, float)
is_arr = isinstance(value, list)
is_dict = isinstance(value, dict)
is_str = isinstance(value, str)
if hasattr(cls, key):
# Prefix with BASE_ENV_PREFIX to avoid potential conflicts
# Default is the value in the config if not found
environment_var = os.getenv(cls.BASE_ENV_PREFIX + key, value)
# default was used
if environment_var == value:
continue
# special case
# TODO: Special cases aren't special enough to break the rules
if environment_var == "null":
environment_var = None
setattr(cls, key, environment_var)
continue
# first on purpose
if is_bool:
environment_var = cls._str2bool(environment_var, value)
elif is_int:
environment_var = cls._try_cast(environment_var, int, value)
elif is_flt:
environment_var = cls._try_cast(environment_var, float, value)
elif is_arr or is_dict:
try:
temp = json.loads(environment_var)
except json.decoder.JSONDecodeError as e:
LOGGER.error("Invalid Format: " + key, e)
continue
if is_arr and isinstance(temp, list):
environment_var = temp
elif is_dict and isinstance(temp, dict):
environment_var = temp
else:
LOGGER.error("Invalid Format: " + key)
continue
# last on purpose
elif not is_str and value is not None:
LOGGER.error("Environment variable %s not supported" % key, environment_var)
continue
setattr(cls, key, environment_var)
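# Illustrative environment overrides (a sketch; the "AL_" prefix is an assumption based on
# the variables the docker-compose files pass in -- the real prefix is whatever
# BASE_ENV_PREFIX is set to above):
#
#   export AL_REDIS_HOST=redis   # string values are taken as-is
#   export AL_REDIS_PORT=6380    # cast back to int because the default is an int
#   export AL_REDIS_PWD=null     # the literal string "null" becomes None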
@classmethod
def as_dict(cls):
"""
:rtype: dict
"""
# Doing this because of inheritance chain
member_list = inspect.getmembers(cls, lambda x: not (inspect.isroutine(x)))
attribute_list = [mem for mem in member_list if not (mem[0].startswith("__") and mem[0].endswith("__"))]
return dict(attribute_list)
@classmethod
def as_attr_dict(cls):
"""
:rtype: munch.Munch | dict
"""
try:
import munch
return munch.munchify(cls.as_dict())
except ImportError:
LOGGER.error("Attribute Dict not Available")
return cls.as_dict()
@staticmethod
def _try_cast(value, _type, _default=None):
try:
return _type(value)
except (ValueError, TypeError):
return _default
@staticmethod
def _str2bool(_str, _default=None):
if isinstance(_str, bool):
return _str
elif isinstance(_str, str):
return _str.lower() in ("true", "1")
else:
return _default
class TestConfig(BaseConfig):
pass
# # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # #
# DO NOT TOUCH THIS, IT HAS TO BE AT THE END OF THIS FILE
# This configures which config is loaded on import when running your code.
class ConfigBuilder(object):
__env_cfg_variable = "BUILDER_CFG_PROFILE"
__current_config_instance = None
__current_config_instance_name = None
__current_config_instance_print = False
__arg_parser = ArgumentParser(add_help=False)
__arg_parser.add_argument("-c", "--config", metavar="\"Config Name...\"", help="Configuration Setup", type=str, default=None)
def __init__(self):
# Placeholder
pass
@staticmethod
def __get_configs():
"""
:rtype: list[Callable[..., BaseConfig]]
"""
all_subclasses = []
def get_all_subclasses(klass):
"""
:type klass: Callable[..., BaseConfig]
"""
for subclass in klass.__subclasses__():
all_subclasses.append(subclass)
get_all_subclasses(subclass)
get_all_subclasses(BaseConfig)
return all_subclasses
@staticmethod
def get_config_names():
"""
:rtype: list[str]
"""
return [klass.__name__ for klass in ConfigBuilder.__get_configs()]
@classmethod
def get_arg_parser(cls):
"""
:rtype: ArgumentParser
"""
return cls.__arg_parser
@classmethod
def init_config(cls, config_name=None, config_base=None, enable_terminal=True, as_attr_dict=True):
"""
:type config_name: str | None
:type config_base: str | None
:type enable_terminal: bool
:type as_attr_dict: bool
"""
if cls.__current_config_instance is None:
cls.__current_config_instance = cls.get_config(config_name, config_base, enable_terminal, as_attr_dict)
@classmethod
def get_config(cls, config_name=None, config_base=None, enable_terminal=True, as_attr_dict=True):
"""
:type config_name: str | None
:type config_base: str | None
:type enable_terminal: bool
:type as_attr_dict: bool
:rtype: BaseConfig
"""
if cls.__current_config_instance is not None and not config_name:
if not cls.__current_config_instance_print:
LOGGER.debug("Reusing config instance \"" + str(cls.__current_config_instance_name) + "\"")
cls.__current_config_instance_print = True
return ConfigBuilder.__current_config_instance
if enable_terminal is True:
terminal_config = getattr(cls.__parse_terminal_config(), "config", None)
config_name = terminal_config if terminal_config is not None else config_name
# Check if there's a config profile as an env variable
config_name = os.getenv(cls.__env_cfg_variable, config_name)
if config_name is None:
LOGGER.debug("Using \"BaseConfig\"...")
config_klass = BaseConfig(root_dir=config_base)
cls.__current_config_instance = config_klass.as_attr_dict() if as_attr_dict else config_klass
cls.__current_config_instance_name = config_klass.__class__.__name__
return cls.__current_config_instance
for klass in cls.__get_configs():
if klass.__name__ == config_name:
LOGGER.debug("Using \"" + config_name + "\" Config...")
config_klass = klass(root_dir=config_base)
cls.__current_config_instance = config_klass.as_attr_dict() if as_attr_dict else config_klass
cls.__current_config_instance_name = config_klass.__class__.__name__
return cls.__current_config_instance
LOGGER.debug("Config Provided Not Found, Using \"BaseConfig\"...")
config_klass = BaseConfig(root_dir=config_base)
cls.__current_config_instance = config_klass.as_attr_dict() if as_attr_dict else config_klass
cls.__current_config_instance_name = config_klass.__class__.__name__
return cls.__current_config_instance
@classmethod
def set_config(cls, config_name=None, config_base=None, enable_terminal=True, as_attr_dict=True):
"""
:type config_name: str | None
:type config_base: str | None
:type enable_terminal: bool
:type as_attr_dict: bool
"""
if cls.__current_config_instance is None:
return
else:
cls.__current_config_instance = None
cls.__current_config_instance = cls.get_config(config_name, config_base, enable_terminal, as_attr_dict)
@classmethod
def __parse_terminal_config(cls):
"""
:rtype: argparse.Namespace
"""
return cls.__arg_parser.parse_known_args()[0]
if __name__ == "__main__":
pass
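# Illustrative usage (a sketch): callers select a profile by class name, by the
# BUILDER_CFG_PROFILE environment variable, or via the -c/--config flag; otherwise
# BaseConfig is used. Attribute access works because as_attr_dict() is the default.
#
#   config = ConfigBuilder.get_config()               # BaseConfig unless overridden
#   config = ConfigBuilder.get_config("TestConfig")   # select a profile by class name
#   print(config.REDIS_PORT, config.SERVICE_LIST[0]["name"])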

View File

@@ -0,0 +1,37 @@
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
# **********************************************************************
# Copyright (C) 2018 Johns Hopkins University Applied Physics Laboratory
#
# All Rights Reserved.
# This material may only be used, modified, or reproduced by or for the
# U.S. government pursuant to the license rights granted under FAR
# clause 52.227-14 or DFARS clauses 252.227-7013/7014.
# For any other permission, please contact the Legal Office at JHU/APL.
# **********************************************************************
import inspect
def transform_module_by_config(module_ref, config_ref, config_prefix=None):
"""
Transforms a given module's properties based on ConfigBuilder Values.
The prefix can be used to avoid blindly changing values and to target a subset of matching values in config_ref.
:type module_ref: ModuleType
:type config_ref: dict
:type config_prefix: str
"""
if not inspect.ismodule(module_ref):
return
config_prefix = config_prefix if isinstance(config_prefix, str) else ""
valid_types = frozenset({str, int, float, bytes, bool, tuple, list, dict})
member_list = inspect.getmembers(module_ref, lambda x: not (inspect.isroutine(x) or inspect.ismodule(x) or inspect.isclass(x)))
attribute_list = [mem for mem in member_list if not (mem[0].startswith("__") and mem[0].endswith("__"))]
filtered_list = [mem for mem in attribute_list if type(mem[1]) in valid_types]
for key in dict(filtered_list):
key_to_get = config_prefix + key.upper()
has_key = key_to_get in config_ref
if not has_key:
continue
key_val = getattr(config_ref, key_to_get, None) or dict.get(config_ref, key_to_get, None)
if type(key_val) in valid_types:
setattr(module_ref, key, key_val)
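# Illustrative usage (a sketch; "my_settings" is a hypothetical module): attribute names are
# upper-cased before the lookup, so a module constant REDIS_PORT is overwritten by the
# config value stored under the same upper-case key.
#
#   import my_settings                            # hypothetical module with REDIS_PORT = 6379
#   cfg = ConfigBuilder.get_config()              # attr-dict of config values from ConfigBuilder
#   transform_module_by_config(my_settings, cfg)  # my_settings.REDIS_PORT now mirrors cfg["REDIS_PORT"]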

View File

@@ -0,0 +1,14 @@
#!/bin/bash
# (C) 2019 The Johns Hopkins University Applied Physics Laboratory LLC.
if [[ $# -lt 2 ]]; then
echo "Usage: $0 <username> <password>"
exit 1
fi
USERNAME="$1"
PASSWORD="$2"
export FLASK_APP="pine.backend"
export FLASK_ENV="development"
pipenv run flask add-admin "${USERNAME}" "${PASSWORD}"

View File

@@ -0,0 +1,6 @@
#!/bin/bash
# (C) 2019 The Johns Hopkins University Applied Physics Laboratory LLC.
export FLASK_APP="pine.backend"
export FLASK_ENV="development"
pipenv run flask print-users

View File

@@ -0,0 +1,6 @@
#!/bin/bash
# (C) 2019 The Johns Hopkins University Applied Physics Laboratory LLC.
export FLASK_APP="pine.backend"
export FLASK_ENV="development"
pipenv run flask reset-user-passwords

View File

@@ -0,0 +1,14 @@
#!/bin/bash
# (C) 2019 The Johns Hopkins University Applied Physics Laboratory LLC.
if [[ $# -lt 2 ]]; then
echo "Usage: $0 <username> <password>"
exit 1
fi
USERNAME="$1"
PASSWORD="$2"
export FLASK_APP="pine.backend"
export FLASK_ENV="development"
pipenv run flask set-user-password "${USERNAME}" "${PASSWORD}"

6
backend/setup_dev_data.sh Executable file
View File

@@ -0,0 +1,6 @@
#!/bin/bash
# (C) 2019 The Johns Hopkins University Applied Physics Laboratory LLC.
set -x
scripts/data/reset_user_passwords.sh

4
cleanup_dev_stack.sh Executable file
View File

@@ -0,0 +1,4 @@
#!/bin/bash
# (C) 2019 The Johns Hopkins University Applied Physics Laboratory LLC.
ps ax | grep 'nlp\|ng \|npm\|flask\|virtualenv\|redis\|mongo' | grep -v 'grep\|avahi'

190
docker-compose-mrddocker.yml Executable file
View File

@@ -0,0 +1,190 @@
# (C) 2019 The Johns Hopkins University Applied Physics Laboratory LLC.
version: "3"
networks:
service:
external: true
private:
internal: true
services:
redis:
container_name: redis
image: nlpwebapp_redis:latest
networks:
- service
volumes:
- ./redis_data:/nlp-web-app/redis/data
- ./redis_logs:/nlp-web-app/redis/log
eve:
container_name: eve
image: nlpwebapp_eve:latest
networks:
- service
volumes:
- ./eve_db:/nlp-web-app/eve/db
- ./eve_logs:/nlp-web-app/eve/logs
dashboard:
container_name: nlp_dashboard
image: johnshopkins-precision-medicine-docker-local.jfrog.io/nlp-dashboard:1.0.1
volumes:
- ./config.yaml:/opt/config.yaml
ports:
- 0.0.0.0:4003:4003
networks:
- service
environment:
SERVER_PORT: 4003
PMAP_MODE: 'yes'
EXTERNAL_URL: https://dev-nlpdashboard.pm.jh.edu
explorer:
container_name: nlp_explorer
image: johnshopkins-precision-medicine-docker-local.jfrog.io/explorer:1.1.0
depends_on:
- redis
- eve
ports:
- 0.0.0.0:4001:4001
networks:
- service
environment:
SERVER_PORT: 4001
REDIS_HOST: redis
REDIS_PORT: 6379
EVE_HOST: eve
EVE_PORT: 7510
PMAP_MODE: 'yes'
EXTERNAL_URL: https://dev-nlpexplorer.pm.jh.edu
matcherui:
container_name: nlp_matcherui
image: johnshopkins-precision-medicine-docker-local.jfrog.io/matcher-ui:1.0.0
depends_on:
- redis
- eve
- matcherservice
ports:
- 0.0.0.0:4000:4000
networks:
- service
environment:
SERVER_PORT: 4000
REDIS_HOST: redis
REDIS_PORT: 6379
EVE_HOST: eve
EVE_PORT: 7510
MATCHER_SERVICE_HOST: matcherservice
MATCHER_SERVICE_PORT: 4002
PMAP_MODE: 'yes'
EXTERNAL_URL: https://dev-nlpmatcher.pm.jh.edu
matcherservice:
container_name: matcherservice
image: johnshopkins-precision-medicine-docker-local.jfrog.io/matcher-service:1.0.0
expose:
- "4002"
depends_on:
- redis
networks:
- service
environment:
REDIS_HOST: redis
REDIS_PORT: 6379
EVE_HOST: eve
EVE_PORT: 7510
MATCHER_SERVICE_PORT: 4002
backend:
container_name: backend
image: nlpwebapp_backend:latest
networks:
- service
depends_on:
- redis
- eve
environment:
AL_REDIS_HOST: redis
AL_REDIS_PORT: ${REDIS_PORT}
frontend_annotation:
container_name: nlpannotator
image: nlpwebapp_frontend_annotation:latest
networks:
- service
depends_on:
- backend
open_nlp:
container_name: open_nlp
image: nlpwebapp_open_nlp:latest
networks:
- service
depends_on:
- redis
- eve
- backend
environment:
AL_PIPELINE: opennlp
AL_REDIS_HOST: redis
AL_REDIS_PORT: ${REDIS_PORT}
AL_EVE_HOST: eve
AL_EVE_PORT: ${EVE_PORT}
volumes:
- "~/models:/pipelines/models/"
links:
- redis
- eve
core_nlp:
container_name: core_nlp
image: nlpwebapp_core_nlp:latest
networks:
- service
depends_on:
- redis
- eve
- backend
environment:
AL_PIPELINE: corenlp
AL_REDIS_HOST: redis
AL_REDIS_PORT: ${REDIS_PORT}
AL_EVE_HOST: eve
AL_EVE_PORT: ${EVE_PORT}
volumes:
- "~/models:/pipelines/models/"
links:
- redis
- eve
spacy:
container_name: spacy
image: nlpwebapp_spacy:latest
networks:
- service
depends_on:
- redis
- eve
- backend
environment:
AL_PIPELINE: spacy
AL_REDIS_HOST: redis
AL_REDIS_PORT: ${REDIS_PORT}
AL_EVE_HOST: eve
AL_EVE_PORT: ${EVE_PORT}
volumes:
- "~/models:/pipelines/models/"
links:
- redis
- eve
volumes:
models:
eve_db:
eve_logs:
redis_data:
redis_logs:

View File

@@ -0,0 +1,34 @@
# (C) 2019 The Johns Hopkins University Applied Physics Laboratory LLC.
version: "3"
services:
backend:
environment:
- VEGAS_CLIENT_SECRET
- EVE_SERVER=http://eve:7510
- REDIS_SERVER=redis
- PINE_LOGGING_CONFIG_FILE=/nlp-web-app/shared/logging.python.dev.json
eve:
build:
args:
- DB_DIR=/nlp-web-app/eve/db
volumes:
- eve_db:/nlp-web-app/eve/db
environment:
- MONGO_URI=
- PINE_LOGGING_CONFIG_FILE=/nlp-web-app/shared/logging.python.dev.json
frontend_annotation:
build:
args:
- SERVER_TYPE=${EXPOSED_SERVER_TYPE}
ports:
- "${EXPOSED_PORT}:443"
environment:
- BACKEND_SERVER=http://backend:${BACKEND_PORT}
- SERVER_NAME=${EXPOSED_SERVER_NAME}
volumes:
eve_db:

27
docker-compose.prod.yml Normal file
View File

@@ -0,0 +1,27 @@
# (C) 2019 The Johns Hopkins University Applied Physics Laboratory LLC.
version: "3"
services:
eve:
build:
args:
- MONGO_URI=${MONGO_URI}
- DB_DIR=
frontend_annotation:
container_name: ${EXPOSED_SERVER_NAME_PROD}
build:
args:
- SERVER_TYPE=${EXPOSED_SERVER_TYPE_PROD}
environment:
- SERVER_NAME=${EXPOSED_SERVER_NAME_PROD}
# do not add a ports section; exposed ports are managed by the external Nginx instance
networks:
- default
- service # Allow the server to communicate with the Nginx instance
networks:
service:
external: true

136
docker-compose.yml Executable file
View File

@@ -0,0 +1,136 @@
# (C) 2019 The Johns Hopkins University Applied Physics Laboratory LLC.
version: "3"
services:
redis:
build:
context: ./redis/
args:
- PORT=${REDIS_PORT}
- DATA_DIR=/nlp-web-app/redis/data
- LOG_DIR=/nlp-web-app/logs/redis
volumes:
- redis_data:/nlp-web-app/redis/data
- ${LOGS_VOLUME}:/nlp-web-app/logs
- ${SHARED_VOLUME}/:/nlp-web-app/shared
environment:
- LOG_FORMAT_SNIPPET=/nlp-web-app/shared/logging.redis.conf
# Expose the following to test:
# ports:
# - ${REDIS_PORT}:${REDIS_PORT}
eve:
build:
context: ./eve/
args:
- PORT=${EVE_PORT}
- LOG_DIR=/nlp-web-app/logs/eve
volumes:
- ${SHARED_VOLUME}/:/nlp-web-app/shared
- ${LOGS_VOLUME}:/nlp-web-app/logs
environment:
- PINE_LOGGING_CONFIG_FILE=/nlp-web-app/shared/logging.python.json
# Expose the following to test:
# ports:
# - "7510:7510"
backend:
depends_on:
- redis
- eve
build:
context: ./backend/
args:
- PORT=${BACKEND_PORT}
- REDIS_PORT=${REDIS_PORT}
# Load environment variables in service.py using Samuel's ConfigBuilder
volumes:
- ${SHARED_VOLUME}:/nlp-web-app/shared
- ${LOGS_VOLUME}:/nlp-web-app/logs
- ${DOCUMENT_IMAGE_VOLUME}:/nlp-web-app/document_images
environment:
AL_REDIS_HOST: redis
AL_REDIS_PORT: ${REDIS_PORT}
AUTH_MODULE: ${AUTH_MODULE}
PINE_LOGGING_CONFIG_FILE: /nlp-web-app/shared/logging.python.json
DOCUMENT_IMAGE_DIR: /nlp-web-app/document_images
# Expose the following to test:
# ports:
# - ${BACKEND_PORT}:${BACKEND_PORT}
frontend_annotation:
depends_on:
- backend
build:
context: ./frontend/annotation/
volumes:
- ${SHARED_VOLUME}/:/nlp-web-app/shared
- ${LOGS_VOLUME}:/nlp-web-app/logs
environment:
- LOG_FORMAT_SNIPPET=/nlp-web-app/shared/logging.nginx.conf
open_nlp:
depends_on:
- redis
- eve
- backend
image: al_pipeline
build:
context: ./pipelines/
dockerfile: docker/Dockerfile
environment:
AL_PIPELINE: opennlp
AL_REDIS_HOST: redis
AL_REDIS_PORT: ${REDIS_PORT}
AL_EVE_HOST: eve
AL_EVE_PORT: ${EVE_PORT}
volumes:
- "${MODELS_VOLUME}:/nlp-web-app/pipelines/models/"
- ${LOGS_VOLUME}:/nlp-web-app/logs
links:
- redis
- eve
core_nlp:
depends_on:
- redis
- eve
- backend
image: al_pipeline
environment:
AL_PIPELINE: corenlp
AL_REDIS_HOST: redis
AL_REDIS_PORT: ${REDIS_PORT}
AL_EVE_HOST: eve
AL_EVE_PORT: ${EVE_PORT}
volumes:
- "${MODELS_VOLUME}:/nlp-web-app/pipelines/models/"
- ${LOGS_VOLUME}:/nlp-web-app/logs
links:
- redis
- eve
spacy:
depends_on:
- redis
- eve
- backend
image: al_pipeline
environment:
AL_PIPELINE: spacy
AL_REDIS_HOST: redis
AL_REDIS_PORT: ${REDIS_PORT}
AL_EVE_HOST: eve
AL_EVE_PORT: ${EVE_PORT}
volumes:
- "${MODELS_VOLUME}:/nlp-web-app/pipelines/models/"
- ${LOGS_VOLUME}:/nlp-web-app/logs
links:
- redis
- eve
volumes:
redis_data:

3
eve/.dockerignore Normal file
View File

@@ -0,0 +1,3 @@
# (C) 2019 The Johns Hopkins University Applied Physics Laboratory LLC.
/db/

13
eve/.gitignore vendored Executable file
View File

@@ -0,0 +1,13 @@
# (C) 2019 The Johns Hopkins University Applied Physics Laboratory LLC.
# Eclipse
/.project
/.pydevproject
/.settings/
# Mongo
/logs/
/db/
# Python
**/__pycache__/

57
eve/Dockerfile Executable file
View File

@@ -0,0 +1,57 @@
# (C) 2019 The Johns Hopkins University Applied Physics Laboratory LLC.
FROM ubuntu:18.04
ARG ROOT_DIR=/nlp-web-app/eve
ARG DB_DIR=/nlp-web-app/eve/db
ARG LOG_DIR=/nlp-web-app/logs/eve
ARG PORT=7510
ARG WORKERS=5
EXPOSE $PORT
# If you want volumes, specify it in docker-compose
ENV DEBIAN_FRONTEND noninteractive
ENV FLASK_PORT $PORT
ENV DB_DIR $DB_DIR
ENV LOG_DIR $LOG_DIR
ENV LC_ALL C.UTF-8
ENV LANG C.UTF-8
RUN mkdir -p $ROOT_DIR $DB_DIR $LOG_DIR
# Install dependencies
RUN apt-get clean && \
apt-get -y update && \
apt-get -y install software-properties-common
RUN apt-get -y update && \
apt-get -y install git build-essential python3.6 python3-pip gettext-base && \
pip3 install --upgrade pip gunicorn pipenv
# Install latest mongodb
RUN if [ -n "${DB_DIR}" ] ; then \
apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv 9DA31620334BD75D9DCB49F368818C72E52529D4 && \
echo "deb [ arch=amd64 ] https://repo.mongodb.org/apt/ubuntu bionic/mongodb-org/4.0 multiverse" > /etc/apt/sources.list.d/mongodb-org-4.0.list && \
apt-get -y update && \
apt-get install -y mongodb-org mongodb-org-server mongodb-org-tools mongodb-org-shell; \
fi
# Install python packages
WORKDIR $ROOT_DIR
ADD Pipfile Pipfile.lock ./
RUN pipenv install --system --deploy
# Add eve and code
ADD docker/wsgi.py $ROOT_DIR
ADD docker_run.sh $ROOT_DIR
ADD test/ $ROOT_DIR/test
ADD python/ $ROOT_DIR/python
COPY docker/config.py.template ./
RUN PORT=$PORT WORKERS=$WORKERS envsubst '${PORT} ${WORKERS}' < ./config.py.template > ./config.py
# Start MongoDB and the Eve Service
CMD ["./docker_run.sh"]

21
eve/Pipfile Normal file
View File

@@ -0,0 +1,21 @@
# (C) 2019 The Johns Hopkins University Applied Physics Laboratory LLC.
[[source]]
url = "https://pypi.org/simple"
verify_ssl = true
name = "pypi"
[packages]
scikit-learn = "*"
pymongo = "*"
requests = "*"
numpy = "*"
scipy = "*"
Eve = "*"
Flask-Cors = "*"
python-json-logger = "*"
[dev-packages]
[requires]
python_version = "3.6"

340
eve/Pipfile.lock generated Normal file
View File

@@ -0,0 +1,340 @@
{
"_meta": {
"hash": {
"sha256": "ed36db0fde2871e6e9e57e16dcf6ca76700050521e18eac829de50b8ad5838ec"
},
"pipfile-spec": 6,
"requires": {
"python_version": "3.6"
},
"sources": [
{
"name": "pypi",
"url": "https://pypi.org/simple",
"verify_ssl": true
}
]
},
"default": {
"cerberus": {
"hashes": [
"sha256:302e6694f206dd85cb63f13fd5025b31ab6d38c99c50c6d769f8fa0b0f299589"
],
"version": "==1.3.2"
},
"certifi": {
"hashes": [
"sha256:1d987a998c75633c40847cc966fcf5904906c920a7f17ef374f5aa4282abd304",
"sha256:51fcb31174be6e6664c5f69e3e1691a2d72a1a12e90f872cbdb1567eb47b6519"
],
"version": "==2020.4.5.1"
},
"chardet": {
"hashes": [
"sha256:84ab92ed1c4d4f16916e05906b6b75a6c0fb5db821cc65e70cbd64a3e2a5eaae",
"sha256:fc323ffcaeaed0e0a02bf4d117757b98aed530d9ed4531e3e15460124c106691"
],
"version": "==3.0.4"
},
"click": {
"hashes": [
"sha256:8a18b4ea89d8820c5d0c7da8a64b2c324b4dabb695804dbfea19b9be9d88c0cc",
"sha256:e345d143d80bf5ee7534056164e5e112ea5e22716bbb1ce727941f4c8b471b9a"
],
"version": "==7.1.1"
},
"eve": {
"hashes": [
"sha256:51cc738218b54042a156e5b513c5ac458923aa5549662c1c2664a3d80582dcc3",
"sha256:7b01cc9ca20984429471aa384ca6019f55f175aa60d857bfc392977b4e3ff1a8"
],
"index": "pypi",
"version": "==1.1"
},
"events": {
"hashes": [
"sha256:f4d9c41a5c160ce504278f219fe56f44242ca63794a0ad638b52d1e087ac2a41"
],
"version": "==0.3"
},
"flask": {
"hashes": [
"sha256:4efa1ae2d7c9865af48986de8aeb8504bf32c7f3d6fdc9353d34b21f4b127060",
"sha256:8a4fdd8936eba2512e9c85df320a37e694c93945b33ef33c89946a340a238557"
],
"version": "==1.1.2"
},
"flask-cors": {
"hashes": [
"sha256:72170423eb4612f0847318afff8c247b38bd516b7737adfc10d1c2cdbb382d16",
"sha256:f4d97201660e6bbcff2d89d082b5b6d31abee04b1b3003ee073a6fd25ad1d69a"
],
"index": "pypi",
"version": "==3.0.8"
},
"idna": {
"hashes": [
"sha256:7588d1c14ae4c77d74036e8c22ff447b26d0fde8f007354fd48a7814db15b7cb",
"sha256:a068a21ceac8a4d63dbfd964670474107f541babbd2250d61922f029858365fa"
],
"version": "==2.9"
},
"itsdangerous": {
"hashes": [
"sha256:321b033d07f2a4136d3ec762eac9f16a10ccd60f53c0c91af90217ace7ba1f19",
"sha256:b12271b2047cb23eeb98c8b5622e2e5c5e9abd9784a153e9d8ef9cb4dd09d749"
],
"version": "==1.1.0"
},
"jinja2": {
"hashes": [
"sha256:93187ffbc7808079673ef52771baa950426fd664d3aad1d0fa3e95644360e250",
"sha256:b0eaf100007721b5c16c1fc1eecb87409464edc10469ddc9a22a27a99123be49"
],
"version": "==2.11.1"
},
"joblib": {
"hashes": [
"sha256:0630eea4f5664c463f23fbf5dcfc54a2bc6168902719fa8e19daf033022786c8",
"sha256:bdb4fd9b72915ffb49fde2229ce482dd7ae79d842ed8c2b4c932441495af1403"
],
"version": "==0.14.1"
},
"markupsafe": {
"hashes": [
"sha256:00bc623926325b26bb9605ae9eae8a215691f33cae5df11ca5424f06f2d1f473",
"sha256:09027a7803a62ca78792ad89403b1b7a73a01c8cb65909cd876f7fcebd79b161",
"sha256:09c4b7f37d6c648cb13f9230d847adf22f8171b1ccc4d5682398e77f40309235",
"sha256:1027c282dad077d0bae18be6794e6b6b8c91d58ed8a8d89a89d59693b9131db5",
"sha256:13d3144e1e340870b25e7b10b98d779608c02016d5184cfb9927a9f10c689f42",
"sha256:24982cc2533820871eba85ba648cd53d8623687ff11cbb805be4ff7b4c971aff",
"sha256:29872e92839765e546828bb7754a68c418d927cd064fd4708fab9fe9c8bb116b",
"sha256:43a55c2930bbc139570ac2452adf3d70cdbb3cfe5912c71cdce1c2c6bbd9c5d1",
"sha256:46c99d2de99945ec5cb54f23c8cd5689f6d7177305ebff350a58ce5f8de1669e",
"sha256:500d4957e52ddc3351cabf489e79c91c17f6e0899158447047588650b5e69183",
"sha256:535f6fc4d397c1563d08b88e485c3496cf5784e927af890fb3c3aac7f933ec66",
"sha256:596510de112c685489095da617b5bcbbac7dd6384aeebeda4df6025d0256a81b",
"sha256:62fe6c95e3ec8a7fad637b7f3d372c15ec1caa01ab47926cfdf7a75b40e0eac1",
"sha256:6788b695d50a51edb699cb55e35487e430fa21f1ed838122d722e0ff0ac5ba15",
"sha256:6dd73240d2af64df90aa7c4e7481e23825ea70af4b4922f8ede5b9e35f78a3b1",
"sha256:717ba8fe3ae9cc0006d7c451f0bb265ee07739daf76355d06366154ee68d221e",
"sha256:79855e1c5b8da654cf486b830bd42c06e8780cea587384cf6545b7d9ac013a0b",
"sha256:7c1699dfe0cf8ff607dbdcc1e9b9af1755371f92a68f706051cc8c37d447c905",
"sha256:88e5fcfb52ee7b911e8bb6d6aa2fd21fbecc674eadd44118a9cc3863f938e735",
"sha256:8defac2f2ccd6805ebf65f5eeb132adcf2ab57aa11fdf4c0dd5169a004710e7d",
"sha256:98c7086708b163d425c67c7a91bad6e466bb99d797aa64f965e9d25c12111a5e",
"sha256:9add70b36c5666a2ed02b43b335fe19002ee5235efd4b8a89bfcf9005bebac0d",
"sha256:9bf40443012702a1d2070043cb6291650a0841ece432556f784f004937f0f32c",
"sha256:ade5e387d2ad0d7ebf59146cc00c8044acbd863725f887353a10df825fc8ae21",
"sha256:b00c1de48212e4cc9603895652c5c410df699856a2853135b3967591e4beebc2",
"sha256:b1282f8c00509d99fef04d8ba936b156d419be841854fe901d8ae224c59f0be5",
"sha256:b2051432115498d3562c084a49bba65d97cf251f5a331c64a12ee7e04dacc51b",
"sha256:ba59edeaa2fc6114428f1637ffff42da1e311e29382d81b339c1817d37ec93c6",
"sha256:c8716a48d94b06bb3b2524c2b77e055fb313aeb4ea620c8dd03a105574ba704f",
"sha256:cd5df75523866410809ca100dc9681e301e3c27567cf498077e8551b6d20e42f",
"sha256:cdb132fc825c38e1aeec2c8aa9338310d29d337bebbd7baa06889d09a60a1fa2",
"sha256:e249096428b3ae81b08327a63a485ad0878de3fb939049038579ac0ef61e17e7",
"sha256:e8313f01ba26fbbe36c7be1966a7b7424942f670f38e666995b88d012765b9be"
],
"version": "==1.1.1"
},
"numpy": {
"hashes": [
"sha256:1598a6de323508cfeed6b7cd6c4efb43324f4692e20d1f76e1feec7f59013448",
"sha256:1b0ece94018ae21163d1f651b527156e1f03943b986188dd81bc7e066eae9d1c",
"sha256:2e40be731ad618cb4974d5ba60d373cdf4f1b8dcbf1dcf4d9dff5e212baf69c5",
"sha256:4ba59db1fcc27ea31368af524dcf874d9277f21fd2e1f7f1e2e0c75ee61419ed",
"sha256:59ca9c6592da581a03d42cc4e270732552243dc45e87248aa8d636d53812f6a5",
"sha256:5e0feb76849ca3e83dd396254e47c7dba65b3fa9ed3df67c2556293ae3e16de3",
"sha256:6d205249a0293e62bbb3898c4c2e1ff8a22f98375a34775a259a0523111a8f6c",
"sha256:6fcc5a3990e269f86d388f165a089259893851437b904f422d301cdce4ff25c8",
"sha256:82847f2765835c8e5308f136bc34018d09b49037ec23ecc42b246424c767056b",
"sha256:87902e5c03355335fc5992a74ba0247a70d937f326d852fc613b7f53516c0963",
"sha256:9ab21d1cb156a620d3999dd92f7d1c86824c622873841d6b080ca5495fa10fef",
"sha256:a1baa1dc8ecd88fb2d2a651671a84b9938461e8a8eed13e2f0a812a94084d1fa",
"sha256:a244f7af80dacf21054386539699ce29bcc64796ed9850c99a34b41305630286",
"sha256:a35af656a7ba1d3decdd4fae5322b87277de8ac98b7d9da657d9e212ece76a61",
"sha256:b1fe1a6f3a6f355f6c29789b5927f8bd4f134a4bd9a781099a7c4f66af8850f5",
"sha256:b5ad0adb51b2dee7d0ee75a69e9871e2ddfb061c73ea8bc439376298141f77f5",
"sha256:ba3c7a2814ec8a176bb71f91478293d633c08582119e713a0c5351c0f77698da",
"sha256:cd77d58fb2acf57c1d1ee2835567cd70e6f1835e32090538f17f8a3a99e5e34b",
"sha256:cdb3a70285e8220875e4d2bc394e49b4988bdb1298ffa4e0bd81b2f613be397c",
"sha256:deb529c40c3f1e38d53d5ae6cd077c21f1d49e13afc7936f7f868455e16b64a0",
"sha256:e7894793e6e8540dbeac77c87b489e331947813511108ae097f1715c018b8f3d"
],
"index": "pypi",
"version": "==1.18.2"
},
"pymongo": {
"hashes": [
"sha256:01b4e10027aef5bb9ecefbc26f5df3368ce34aef81df43850f701e716e3fe16d",
"sha256:0fc5aa1b1acf7f61af46fe0414e6a4d0c234b339db4c03a63da48599acf1cbfc",
"sha256:1396eb7151e0558b1f817e4b9d7697d5599e5c40d839a9f7270bd90af994ad82",
"sha256:18e84a3ec5e73adcb4187b8e5541b2ad61d716026ed9863267e650300d8bea33",
"sha256:19adf2848b80cb349b9891cc854581bbf24c338be9a3260e73159bdeb2264464",
"sha256:20ee0475aa2ba437b0a14806f125d696f90a8433d820fb558fdd6f052acde103",
"sha256:26798795097bdeb571f13942beef7e0b60125397811c75b7aa9214d89880dd1d",
"sha256:26e707a4eb851ec27bb969b5f1413b9b2eac28fe34271fa72329100317ea7c73",
"sha256:2a3c7ad01553b27ec553688a1e6445e7f40355fb37d925c11fcb50b504e367f8",
"sha256:2f07b27dbf303ea53f4147a7922ce91a26b34a0011131471d8aaf73151fdee9a",
"sha256:316f0cf543013d0c085e15a2c8abe0db70f93c9722c0f99b6f3318ff69477d70",
"sha256:31d11a600eea0c60de22c8bdcb58cda63c762891facdcb74248c36713240987f",
"sha256:334ef3ffd0df87ea83a0054454336159f8ad9c1b389e19c0032d9cb8410660e6",
"sha256:358ba4693c01022d507b96a980ded855a32dbdccc3c9331d0667be5e967f30ed",
"sha256:3a6568bc53103df260f5c7d2da36dffc5202b9a36c85540bba1836a774943794",
"sha256:444bf2f44264578c4085bb04493bfed0e5c1b4fe7c2704504d769f955cc78fe4",
"sha256:47a00b22c52ee59dffc2aad02d0bbfb20c26ec5b8de8900492bf13ad6901cf35",
"sha256:4c067db43b331fc709080d441cb2e157114fec60749667d12186cc3fc8e7a951",
"sha256:4c092310f804a5d45a1bcaa4191d6d016c457b6ed3982a622c35f729ff1c7f6b",
"sha256:53b711b33134e292ef8499835a3df10909c58df53a2a0308f598c432e9a62892",
"sha256:568d6bee70652d8a5af1cd3eec48b4ca1696fb1773b80719ebbd2925b72cb8f6",
"sha256:56fa55032782b7f8e0bf6956420d11e2d4e9860598dfe9c504edec53af0fc372",
"sha256:5a2c492680c61b440272341294172fa3b3751797b1ab983533a770e4fb0a67ac",
"sha256:61235cc39b5b2f593086d1d38f3fc130b2d125bd8fc8621d35bc5b6bdeb92bd2",
"sha256:619ac9aaf681434b4d4718d1b31aa2f0fce64f2b3f8435688fcbdc0c818b6c54",
"sha256:6238ac1f483494011abde5286282afdfacd8926659e222ba9b74c67008d3a58c",
"sha256:63752a72ca4d4e1386278bd43d14232f51718b409e7ac86bcf8810826b531113",
"sha256:6fdc5ccb43864065d40dd838437952e9e3da9821b7eac605ba46ada77f846bdf",
"sha256:7abc3a6825a346fa4621a6f63e3b662bbb9e0f6ffc32d30a459d695f20fb1a8b",
"sha256:7aef381bb9ae8a3821abd7f9d4d93978dbd99072b48522e181baeffcd95b56ae",
"sha256:80df3caf251fe61a3f0c9614adc6e2bfcffd1cd3345280896766712fb4b4d6d7",
"sha256:95f970f34b59987dee6f360d2e7d30e181d58957b85dff929eee4423739bd151",
"sha256:993257f6ca3cde55332af1f62af3e04ca89ce63c08b56a387cdd46136c72f2fa",
"sha256:9c0a57390549affc2b5dda24a38de03a5c7cbc58750cd161ff5d106c3c6eec80",
"sha256:a0794e987d55d2f719cc95fcf980fc62d12b80e287e6a761c4be14c60bd9fecc",
"sha256:a3b98121e68bf370dd8ea09df67e916f93ea95b52fc010902312168c4d1aff5d",
"sha256:a60756d55f0887023b3899e6c2923ba5f0042fb11b1d17810b4e07395404f33e",
"sha256:a676bd2fbc2309092b9bbb0083d35718b5420af3a42135ebb1e4c3633f56604d",
"sha256:a732838c78554c1257ff2492f5c8c4c7312d0aecd7f732149e255f3749edd5ee",
"sha256:ae65d65fde4135ef423a2608587c9ef585a3551fc2e4e431e7c7e527047581be",
"sha256:b070a4f064a9edb70f921bfdc270725cff7a78c22036dd37a767c51393fb956f",
"sha256:b6da85949aa91e9f8c521681344bd2e163de894a5492337fba8b05c409225a4f",
"sha256:bbf47110765b2a999803a7de457567389253f8670f7daafb98e059c899ce9764",
"sha256:c06b3f998d2d7160db58db69adfb807d2ec307e883e2f17f6b87a1ef6c723f11",
"sha256:c318fb70542be16d3d4063cde6010b1e4d328993a793529c15a619251f517c39",
"sha256:c4aef42e5fa4c9d5a99f751fb79caa880dac7eaf8a65121549318b984676a1b7",
"sha256:c9ca545e93a9c2a3bdaa2e6e21f7a43267ff0813e8055adf2b591c13164c0c57",
"sha256:da2c3220eb55c4239dd8b982e213da0b79023cac59fe54ca09365f2bc7e4ad32",
"sha256:dd8055da300535eefd446b30995c0813cc4394873c9509323762a93e97c04c03",
"sha256:e2b46e092ea54b732d98c476720386ff2ccd126de1e52076b470b117bff7e409",
"sha256:e334c4f39a2863a239d38b5829e442a87f241a92da9941861ee6ec5d6380b7fe",
"sha256:e5c54f04ca42bbb5153aec5d4f2e3d9f81e316945220ac318abd4083308143f5",
"sha256:f96333f9d2517c752c20a35ff95de5fc2763ac8cdb1653df0f6f45d281620606"
],
"index": "pypi",
"version": "==3.10.1"
},
"python-json-logger": {
"hashes": [
"sha256:b7a31162f2a01965a5efb94453ce69230ed208468b0bbc7fdfc56e6d8df2e281"
],
"index": "pypi",
"version": "==0.1.11"
},
"requests": {
"hashes": [
"sha256:43999036bfa82904b6af1d99e4882b560e5e2c68e5c4b0aa03b655f3d7d73fee",
"sha256:b3f43d496c6daba4493e7c431722aeb7dbc6288f52a6e04e7b6023b0247817e6"
],
"index": "pypi",
"version": "==2.23.0"
},
"scikit-learn": {
"hashes": [
"sha256:1bf45e62799b6938357cfce19f72e3751448c4b27010e4f98553da669b5bbd86",
"sha256:267ad874b54c67b479c3b45eb132ef4a56ab2b27963410624a413a4e2a3fc388",
"sha256:2d1bb83d6c51a81193d8a6b5f31930e2959c0e1019d49bdd03f54163735dae4b",
"sha256:349ba3d837fb3f7cb2b91486c43713e4b7de17f9e852f165049b1b7ac2f81478",
"sha256:3f4d8eea3531d3eaf613fa33f711113dfff6021d57a49c9d319af4afb46f72f0",
"sha256:4990f0e166292d2a0f0ee528233723bcfd238bfdb3ec2512a9e27f5695362f35",
"sha256:57538d138ba54407d21e27c306735cbd42a6aae0df6a5a30c7a6edde46b0017d",
"sha256:5b722e8bb708f254af028dc2da86d23df5371cba57e24f889b672e7b15423caa",
"sha256:6043e2c4ccfc68328c331b0fc19691be8fb02bd76d694704843a23ad651de902",
"sha256:672ea38eb59b739a8907ec063642b486bcb5a2073dda5b72b7983eeaf1fd67c1",
"sha256:73207dca6e70f8f611f28add185cf3a793c8232a1722f21d82259560dc35cd50",
"sha256:83fc104a799cb340054e485c25dfeee712b36f5638fb374eba45a9db490f16ff",
"sha256:8416150ab505f1813da02cdbdd9f367b05bfc75cf251235015bb09f8674358a0",
"sha256:84e759a766c315deb5c85139ff879edbb0aabcddb9358acf499564ed1c21e337",
"sha256:8ed66ab27b3d68e57bb1f315fc35e595a5c4a1f108c3420943de4d18fc40e615",
"sha256:a7f8aa93f61aaad080b29a9018db93ded0586692c03ddf2122e47dd1d3a14e1b",
"sha256:ddd3bf82977908ff69303115dd5697606e669d8a7eafd7d83bb153ef9e11bd5e",
"sha256:de9933297f8659ee3bb330eafdd80d74cd73d5dab39a9026b65a4156bc479063",
"sha256:ea91a70a992ada395efc3d510cf011dc2d99dc9037bb38cd1cb00e14745005f5",
"sha256:eb4c9f0019abb374a2e55150f070a333c8f990b850d1eb4dfc2765fc317ffc7c",
"sha256:ffce8abfdcd459e72e5b91727b247b401b22253cbd18d251f842a60e26262d6f"
],
"index": "pypi",
"version": "==0.22.2.post1"
},
"scipy": {
"hashes": [
"sha256:00af72998a46c25bdb5824d2b729e7dabec0c765f9deb0b504f928591f5ff9d4",
"sha256:0902a620a381f101e184a958459b36d3ee50f5effd186db76e131cbefcbb96f7",
"sha256:1e3190466d669d658233e8a583b854f6386dd62d655539b77b3fa25bfb2abb70",
"sha256:2cce3f9847a1a51019e8c5b47620da93950e58ebc611f13e0d11f4980ca5fecb",
"sha256:3092857f36b690a321a662fe5496cb816a7f4eecd875e1d36793d92d3f884073",
"sha256:386086e2972ed2db17cebf88610aab7d7f6e2c0ca30042dc9a89cf18dcc363fa",
"sha256:71eb180f22c49066f25d6df16f8709f215723317cc951d99e54dc88020ea57be",
"sha256:770254a280d741dd3436919d47e35712fb081a6ff8bafc0f319382b954b77802",
"sha256:787cc50cab3020a865640aba3485e9fbd161d4d3b0d03a967df1a2881320512d",
"sha256:8a07760d5c7f3a92e440ad3aedcc98891e915ce857664282ae3c0220f3301eb6",
"sha256:8d3bc3993b8e4be7eade6dcc6fd59a412d96d3a33fa42b0fa45dc9e24495ede9",
"sha256:9508a7c628a165c2c835f2497837bf6ac80eb25291055f56c129df3c943cbaf8",
"sha256:a144811318853a23d32a07bc7fd5561ff0cac5da643d96ed94a4ffe967d89672",
"sha256:a1aae70d52d0b074d8121333bc807a485f9f1e6a69742010b33780df2e60cfe0",
"sha256:a2d6df9eb074af7f08866598e4ef068a2b310d98f87dc23bd1b90ec7bdcec802",
"sha256:bb517872058a1f087c4528e7429b4a44533a902644987e7b2fe35ecc223bc408",
"sha256:c5cac0c0387272ee0e789e94a570ac51deb01c796b37fb2aad1fb13f85e2f97d",
"sha256:cc971a82ea1170e677443108703a2ec9ff0f70752258d0e9f5433d00dda01f59",
"sha256:dba8306f6da99e37ea08c08fef6e274b5bf8567bb094d1dbe86a20e532aca088",
"sha256:dc60bb302f48acf6da8ca4444cfa17d52c63c5415302a9ee77b3b21618090521",
"sha256:dee1bbf3a6c8f73b6b218cb28eed8dd13347ea2f87d572ce19b289d6fd3fbc59"
],
"index": "pypi",
"version": "==1.4.1"
},
"simplejson": {
"hashes": [
"sha256:0fe3994207485efb63d8f10a833ff31236ed27e3b23dadd0bf51c9900313f8f2",
"sha256:17163e643dbf125bb552de17c826b0161c68c970335d270e174363d19e7ea882",
"sha256:1d1e929cdd15151f3c0b2efe953b3281b2fd5ad5f234f77aca725f28486466f6",
"sha256:1ea59f570b9d4916ae5540a9181f9c978e16863383738b69a70363bc5e63c4cb",
"sha256:22a7acb81968a7c64eba7526af2cf566e7e2ded1cb5c83f0906b17ff1540f866",
"sha256:2b4b2b738b3b99819a17feaf118265d0753d5536049ea570b3c43b51c4701e81",
"sha256:4cf91aab51b02b3327c9d51897960c554f00891f9b31abd8a2f50fd4a0071ce8",
"sha256:7cce4bac7e0d66f3a080b80212c2238e063211fe327f98d764c6acbc214497fc",
"sha256:8027bd5f1e633eb61b8239994e6fc3aba0346e76294beac22a892eb8faa92ba1",
"sha256:86afc5b5cbd42d706efd33f280fec7bd7e2772ef54e3f34cf6b30777cd19a614",
"sha256:87d349517b572964350cc1adc5a31b493bbcee284505e81637d0174b2758ba17",
"sha256:926bcbef9eb60e798eabda9cd0bbcb0fca70d2779aa0aa56845749d973eb7ad5",
"sha256:9a126c3a91df5b1403e965ba63b304a50b53d8efc908a8c71545ed72535374a3",
"sha256:daaf4d11db982791be74b23ff4729af2c7da79316de0bebf880fa2d60bcc8c5a",
"sha256:fc046afda0ed8f5295212068266c92991ab1f4a50c6a7144b69364bdee4a0159",
"sha256:fc9051d249dd5512e541f20330a74592f7a65b2d62e18122ca89bf71f94db748"
],
"version": "==3.17.0"
},
"six": {
"hashes": [
"sha256:236bdbdce46e6e6a3d61a337c0f8b763ca1e8717c03b369e87a7ec7ce1319c0a",
"sha256:8f3cd2e254d8f793e7f3d6d9df77b92252b52637291d0f0da013c76ea2724b6c"
],
"version": "==1.14.0"
},
"urllib3": {
"hashes": [
"sha256:2f3db8b19923a873b3e5256dc9c2dedfa883e33d87c690d9c7913e1f40673cdc",
"sha256:87716c2d2a7121198ebcb7ce7cccf6ce5e9ba539041cfbaeecfb641dc0bf6acc"
],
"version": "==1.25.8"
},
"werkzeug": {
"hashes": [
"sha256:2de2a5db0baeae7b2d2664949077c2ac63fbd16d98da0ff71837f7d1dea3fd43",
"sha256:6c80b1e5ad3665290ea39320b91e1be1e0d5f60652b964a3070216de83d2e47c"
],
"version": "==1.0.1"
}
},
"develop": {}
}

26
eve/README.md Executable file
View File

@@ -0,0 +1,26 @@
## Development Environment
Requirements:
* Mongo >=4
* Python 3, Pip, and Pipenv
With pipenv, doing a `pipenv install --dev` will install the python
dependencies in a virtual environment.
After this you can run `dev_run.sh` to run the dev environment.
Initially the system starts with an empty database; run
`setup_dev_data.sh` to set up initial testing data.
This script creates the valid users, pipelines, collections, documents, and
annotations in the database. Each run generates new data, so the identifiers
will change; they are printed at the end of the script.
Note that this setup script adds users but does NOT set their passwords.
You should use the setup script in the backend directory to do that. Run
`set_user_password.sh` with their email, not username.
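For example, a first-time setup might look like the following (an illustrative sketch; the
exact location of `set_user_password.sh` inside the backend directory may differ, and the
email and password shown are placeholders):
```bash
pipenv install --dev
./dev_run.sh &
./setup_dev_data.sh
# then, from the backend directory, set a password using the user's email:
./set_user_password.sh user@example.com changeme
```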
## Production Environment
This service can also be run as a Docker container.
It should be run using docker-compose at the top level (../).

31
eve/dev_run.sh Executable file
View File

@@ -0,0 +1,31 @@
#!/bin/bash
# (C) 2019 The Johns Hopkins University Applied Physics Laboratory LLC.
DIR="$( cd "$( dirname "${0}" )" && pwd )"
DATA_DIR=${DATA_DIR:-${DIR}}
LOG_DIR=${LOG_DIR:-${DIR}/logs}
DB_DIR=${DATA_DIR}/db
set -ex
mkdir -p ${DB_DIR} ${LOG_DIR}
# use a port separate from the system-wide mongo port, if it's running as a service
if [[ -z ${MONGO_PORT} ]]; then
MONGO_PORT="27018"
fi
export MONGO_PORT
mkdir -p logs/ db/
mongod --dbpath ${DB_DIR} \
--port ${MONGO_PORT} \
--logpath ${LOG_DIR}/mongod.log \
--logRotate reopen --logappend &
if [[ -z ${FLASK_PORT} ]]; then
FLASK_PORT="5001"
fi
export FLASK_PORT
export FLASK_ENV="development"
export PYTHONPATH="${DIR}/python"
pipenv run python3 ${DIR}/python/EveDataLayer.py

View File

@@ -0,0 +1,13 @@
import logging.config
import os
import json
bind = "0.0.0.0:${PORT}"
workers = ${WORKERS}
accesslog = "-"
if "PINE_LOGGING_CONFIG_FILE" in os.environ and os.path.isfile(os.environ["PINE_LOGGING_CONFIG_FILE"]):
with open(os.environ["PINE_LOGGING_CONFIG_FILE"], "r") as f:
c = json.load(f)
c["disable_existing_loggers"] = True
logconfig_dict = c

5
eve/docker/wsgi.py Normal file
View File

@@ -0,0 +1,5 @@
# (C) 2019 The Johns Hopkins University Applied Physics Laboratory LLC.
from python.EveDataLayer import create_app
app = create_app()

28
eve/docker_run.sh Executable file
View File

@@ -0,0 +1,28 @@
#!/bin/bash
# (C) 2019 The Johns Hopkins University Applied Physics Laboratory LLC.
DIR="$( cd "$( dirname "${0}" )" && pwd )"
set -e
if [[ -z ${MONGO_URI} ]] && [[ -z ${DB_DIR} ]]; then
echo "Please set MONGO_URI or DB_DIR in docker configuration."
exit 1
fi
if [[ -n ${MONGO_URI} ]]; then
export MONGO_URI
fi
if [[ -n ${FLASK_PORT} ]]; then
export FLASK_PORT
fi
mkdir -p ${LOG_DIR}
if [[ -z ${MONGO_URI} ]]; then
# only run if MONGO_URI is not set
mongod --dbpath ${DB_DIR} \
--logpath ${LOG_DIR}/mongod.log \
--logRotate reopen --logappend &
fi
/usr/local/bin/gunicorn --config config.py --pythonpath ${DIR}/python wsgi:app

103
eve/python/EveDataLayer.py Executable file
View File

@@ -0,0 +1,103 @@
# (C) 2019 The Johns Hopkins University Applied Physics Laboratory LLC.
import logging.config
import json
import xml.etree.ElementTree as ET
import os
import subprocess
import tempfile
from eve import Eve
from flask import jsonify, request, send_file
from flask_cors import CORS
from werkzeug import exceptions
from Jhed import JHED, JHEDEncoder, JHEDValidator
def post_documents_get_callback(request, payload):
if "truncate" in request.args:
truncate = int(request.args.get("truncate", 50))
if payload.is_json:
data = payload.get_json()
if "text" in data:
data["text"] = data["text"][0 : truncate]
elif "_items" in data:
for item in data["_items"]:
if "text" in item:
item["text"] = item["text"][0 : truncate]
payload.data = json.dumps(data)
elif payload.data:
data = ET.fromstring(payload.data)
text = data.find("text")
if text != None:
text.text = text.text[0 : truncate]
else:
for child in data.findall("resource"):
text = child.find("text")
if text != None and text.text != None:
text.text = text.text[0 : truncate]
payload.data = ET.tostring(data)
def setup_logging():
if "PINE_LOGGING_CONFIG_FILE" in os.environ and os.path.isfile(os.environ["PINE_LOGGING_CONFIG_FILE"]):
with open(os.environ["PINE_LOGGING_CONFIG_FILE"], "r") as f:
logging.config.dictConfig(json.load(f))
logging.getLogger(__name__).info("Set logging configuration from file {}".format(os.environ["PINE_LOGGING_CONFIG_FILE"]))
def create_app():
setup_logging()
#app = Eve()
app = Eve(json_encoder=JHEDEncoder, validator=JHEDValidator)
app.on_post_GET_documents += post_documents_get_callback
@app.route("/system/export", methods = ["GET"])
def system_export():
db = app.data.driver.db
(f, filename) = tempfile.mkstemp()
os.close(f)
cmd = ["mongodump", "--host", db.client.address[0], "--port", str(db.client.address[1]),
"--gzip", "--archive={}".format(filename)]
print("RUNNING: {}".format(cmd))
try:
output = subprocess.check_output(cmd)
print(output)
return send_file(filename, as_attachment = True, attachment_filename = "dump.gz",
mimetype = "application/gzip")
finally:
os.remove(filename)
@app.route("/system/import", methods = ["PUT", "POST"])
def system_import():
db = app.data.driver.db
dump_first = request.method.upper() == "POST"
if not "file" in request.files:
raise exceptions.UnprocessableEntity("Missing 'file' parameter")
(f, filename) = tempfile.mkstemp()
os.close(f)
try:
request.files["file"].save(filename)
cmd = ["mongorestore", "--host", db.client.address[0], "--port", str(db.client.address[1]),
"--gzip", "--archive={}".format(filename)]
if dump_first:
cmd.append("--drop")
print("RUNNING: {}".format(cmd))
output = subprocess.check_output(cmd)
print(output)
return jsonify({
"success": True
})
except Exception as e:
print(e)
raise exceptions.BadRequest("Error parsing input:" + str(e))
finally:
os.remove(filename)
return app
if __name__ == '__main__':
app = create_app()
CORS(app, supports_credentials=True)
FLASK_PORT = int(os.environ.get("FLASK_PORT", 5000))
app.run(host="0.0.0.0", port = FLASK_PORT)
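# Illustrative client calls against a locally running dev instance (a sketch; port 5001 is
# the dev default and the flags and file names are examples):
#
#   curl "http://localhost:5001/documents?truncate=100"            # document text truncated to 100 chars
#   curl -o dump.gz "http://localhost:5001/system/export"          # download a gzipped mongodump archive
#   curl -F "file=@dump.gz" "http://localhost:5001/system/import"  # restore; POST drops existing data first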

48
eve/python/Jhed.py Normal file
View File

@@ -0,0 +1,48 @@
# (C) 2019 The Johns Hopkins University Applied Physics Laboratory LLC.
from eve.io.mongo import Validator
from eve.io.base import BaseJSONEncoder
from bson import ObjectId
class JHED(object):
def __init__(self, jhed):
self.jhed = jhed.lower()
def __str__(self):
return self.jhed
def __hash__(self):
return hash(self.jhed)
def __eq__(self, other):
if isinstance(other, JHED):
return self.jhed == other.jhed
return NotImplemented
class JHEDValidator(Validator):
"""
Extends the base Mongo validator, adding support for the JHED data type.
"""
def _validate_type_jhed(self, value):
try:
JHED(value)
except ValueError:
pass
return True
class JHEDEncoder(BaseJSONEncoder):
""" JSONEconder subclass used by the json render function.
This is different from BaseJSONEoncoder since it also addresses
encoding of UUID
"""
def default(self, obj):
if isinstance(obj, JHED):
return str(obj)
elif isinstance(obj, ObjectId):
return str(obj)
else:
# delegate rendering to base class method (the base class
# will properly render ObjectIds, datetimes, etc.)
return super(JHEDEncoder, self).default(obj)
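# Illustrative behavior (a sketch): JHED values are lower-cased on construction, so equality
# and hashing are case-insensitive, and the encoder renders them as plain strings.
#
#   JHED("BChee1") == JHED("bchee1")   # True
#   str(JHED("BChee1"))                # "bchee1"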

1
eve/python/__init__.py Normal file
View File

@@ -0,0 +1 @@
# (C) 2019 The Johns Hopkins University Applied Physics Laboratory LLC.

186
eve/python/settings.py Executable file
View File

@@ -0,0 +1,186 @@
# (C) 2019 The Johns Hopkins University Applied Physics Laboratory LLC.
import logging
import os
LOGGER = logging.getLogger("pine.eve." + __name__)
collections = {
'schema': {
'creator_id': {'type': 'jhed', 'required': True},
'annotators': {'type': 'list', 'schema':{'type':'string'}},
'viewers': {'type': 'list', 'schema':{'type':'string'}},
'labels': {'type': 'list','required': True},
'metadata': {'type': 'dict'},
'archived': {'type': 'boolean'},
'configuration': {'type': 'dict'}
},
'item_methods': ['GET', 'PUT', 'PATCH']
}
documents = {
'schema': {
'creator_id': {'type': 'jhed', 'required': True},
'collection_id': {'type': 'objectid', 'required': True},
'overlap': {'type': 'integer'},
'text': {'type': 'string'},
'uri': {'type':'string'},
'metadata': {'type': 'dict'},
"has_annotated" : {'type' : 'dict'}
},
'item_methods': ['GET', 'PUT', 'PATCH'],
'mongo_indexes':{'doc_creator_id': [('creator_id', 1)], 'doc_collection_id':[('collection_id', 1)]}
}
iaa_reports = {
'schema' :{
'collection_id' : { 'type' : 'objectid' },
'num_of_annotators' : {'type' : 'integer'},
'num_of_agreement_docs': {'type' : 'integer'},
'num_of_labels' : { 'type' : 'integer'},
'per_doc_agreement' : { 'type' : 'list'},
'per_label_agreement': {'type': 'list'},
'overall_agreement': {'type': 'dict'},
'labels_per_annotator': {'type': 'dict'},
},
'item_methods': ['GET', 'PUT', 'PATCH'],
'versioning': True
}
annotations = {
'schema': {
'creator_id': {'type': 'jhed', 'required': True},
'collection_id': {'type': 'objectid', 'required': True},
'document_id': {'type': 'objectid', 'required': True},
'annotation': {'type': 'list'}
},
'mongo_indexes':{'ann_creator_id': [('creator_id', 1)], 'ann_collection_id':[('collection_id', 1)], 'ann_document_id':[('document_id',1)]},
'item_methods':['GET', 'PUT', 'PATCH'],
'versioning':True
}
pipelines = {
'schema': {
'_id':{'type':'objectid', 'required':True},
'title': {'type': 'string', 'required': True},
'name': {'type': 'string', 'required': True},
'description': {'type': 'string'},
'parameters': {'type': 'dict'}
}
}
users = {
'schema': {
'_id':{'type': 'jhed', 'required': True},
'firstname': {'type': 'string', 'required': True},
'lastname': {'type': 'string', 'required': True},
'email': {'type': 'string'},
'description': {'type': 'string'},
'role': {
'type': 'list',
'allowed': ['administrator', 'user'],
},
'passwdhash': {'type': 'string'}
},
'item_url': 'regex("[a-z]{4,9}[0-9]{1,4}")',
#'item_lookup_field': '_id', # Name of object field ex. mongo object id here
'query_objectid_as_string': True,
'item_methods':['GET', 'PUT', 'DELETE']
}
classifiers = {
'schema': {
'collection_id': {'type': 'objectid', 'required': True},
'overlap':{'type':'float', 'required':True},
'pipeline_id':{'type':'objectid', 'required':True},
'parameters':{'type':'dict'},
'labels':{'type':'list', 'required':True},
'filename':{'type':'string'},
'train_every': {'type': 'integer', 'default': 100},
'annotated_document_count': {'type': 'integer', 'default': 0}
},
'mongo_indexes':{'class_pipeline_id': [('pipeline_id', 1)], 'class_collection_id':[('collection_id', 1)]},
'item_methods':['GET', 'PUT', 'PATCH'],
'versioning':True
}
metrics = {
'schema': {
'collection_id': {'type': 'objectid', 'required': True},
'classifier_id': {'type': 'objectid', 'required': True},
'classifier_db_version': {'type': 'integer'},
'documents': {'type': 'list', 'required': True},
'annotations': {'type': 'list', 'required': True},
'folds': {'type': 'list'},
'metrics': {'type': 'list'},
'metric_averages' : {'type': 'dict'},
'filename': {'type': 'string'},
'trained_classifier_db_version': {'type': 'integer'}
},
'mongo_indexes':{'metrics_classifier_id': [('classifier_id', 1)], 'doc_collection_id':[('collection_id', 1)]},
'item_methods':['GET', 'PUT', 'PATCH'],
'versioning':True
}
next_instances = {
'schema':{
'classifier_id':{'type':'objectid', 'required':True},
'document_ids':{'type':'list', 'required':True},
'overlap_document_ids':{'type':'dict', 'required':True}
},
'mongo_indexes':{'next_classifier_id': [('classifier_id', 1)]},
'item_methods':['GET', 'PUT', 'PATCH']
}
parsed = {
'schema': {
'collection_id': {'type': 'objectid', 'required': True},
'_id':{'type':'objectid', 'required':True},
'text': {'type': 'string'},
'spacy':{'type':'string'}
},
'mongo_indexes':{'doc_collection_id':[('collection_id', 1)]},
'item_methods':['GET', 'PUT', 'PATCH']
}
DOMAIN={
'collections':collections,
'documents':documents,
'annotations':annotations,
'classifiers':classifiers,
'metrics':metrics,
'users':users,
'pipelines':pipelines,
'next_instances':next_instances,
'parsed':parsed,
'iaa_reports' : iaa_reports
}
if os.environ.get("MONGO_URI"):
MONGO_URI = os.environ.get("MONGO_URI")
#LOGGER.info("Eve using MONGO_URI={}".format(MONGO_URI))
else:
MONGO_HOST = "localhost"
MONGO_PORT = int(os.environ.get("MONGO_PORT", 27017))
LOGGER.info("Eve using MONGO_HOST={} and MONGO_PORT={}".format(MONGO_HOST, MONGO_PORT))
# Skip these if your db has no auth. But it really should.
#MONGO_USERNAME = '<your username>'
#MONGO_PASSWORD = '<your password>'
MONGO_DBNAME = 'pmap_nlp'
# Enable reads (GET), inserts (POST) and DELETE for resources/collections
# (if you omit this line, the API will default to ['GET'] and provide
# read-only access to the endpoint).
RESOURCE_METHODS = ['GET', 'POST']
# Enable reads (GET), edits (PATCH), replacements (PUT) and deletes of
# individual items (defaults to read-only item access).
ITEM_METHODS = ['GET', 'PUT']
PAGINATION = False
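# Illustrative override (a sketch; the URI is a placeholder): point Eve at an external,
# authenticated MongoDB instead of the local mongod by setting MONGO_URI before start-up.
#
#   export MONGO_URI="mongodb://user:password@mongo.example.org:27017/pmap_nlp"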

View File

@@ -0,0 +1,56 @@
# (C) 2019 The Johns Hopkins University Applied Physics Laboratory LLC.
import requests
import json
import os
PAGINATION = False
if "FLASK_PORT" in os.environ:
EVE_URL = 'http://localhost:{}/'.format(os.environ["FLASK_PORT"])
else:
EVE_URL = 'http://localhost:5001/'
def get_collection_annotators(collection_id):
collection = requests.get(EVE_URL + 'collections/' + collection_id).json()
return collection["annotators"]
def get_document_annotations(doc_id):
annotations = requests.get(EVE_URL + 'annotations', params = {"where" : json.dumps({"document_id" : doc_id })}).json()["_items"]
return (annotations)
def update_document(document):
if "has_annotated" not in document.keys():
new_document = {"has_annotated": {} }
annotators = get_collection_annotators(document["collection_id"])
for annotator in annotators:
new_document["has_annotated"][annotator] = False
for annotation in get_document_annotations(document["_id"]):
print(annotation["document_id"])
new_document["has_annotated"][annotation["creator_id"]] = True
e_tag = document["_etag"]
headers = {"If-Match": e_tag}
print(requests.patch(EVE_URL + "documents/" + document["_id"], json= new_document, headers = headers))
DOCUMENT_PROJECTION = {
"projection": json.dumps({
"text": 0
})
}
if PAGINATION:
page = 0
documents = requests.get(EVE_URL + 'documents?page=' + str(page), params=DOCUMENT_PROJECTION).json()["_items"]
while len(documents) > 0:
page += 1
documents = requests.get(EVE_URL + 'documents?page=' + str(page), params=DOCUMENT_PROJECTION).json()["_items"]
for document in documents:
update_document(document)
else:
documents = requests.get(EVE_URL + "documents", params=DOCUMENT_PROJECTION).json()["_items"]
for document in documents:
update_document(document)

14
eve/setup_dev_data.sh Executable file
View File

@@ -0,0 +1,14 @@
#!/bin/bash
# (C) 2019 The Johns Hopkins University Applied Physics Laboratory LLC.
set -x
if [[ -z ${MONGO_PORT} ]]; then
MONGO_PORT="27018"
fi
export MONGO_PORT
if [[ -z ${FLASK_PORT} ]]; then
FLASK_PORT="5001"
fi
export FLASK_PORT
pipenv run python3 test/EveClient.py

410
eve/test/EveClient.py Executable file
View File

@@ -0,0 +1,410 @@
# -*- coding: utf-8 -*-
# (C) 2019 The Johns Hopkins University Applied Physics Laboratory LLC.
"""
eve-demo-client
~~~~~~~~~~~~~~~
Simple and quickly hacked together, this script (adapted from the eve-demo
client) is used to populate the Eve API with test data. It uses standard API
calls to create test users, pipelines, collections, documents, annotations,
and classifiers.
I guess it can also serve as a basic example of how to programmatically
manage a remote API using the phenomenal Requests library by Kenneth Reitz
(a very basic 'get' function is included even if not used).
:copyright: (c) 2015 by Nicola Iarocci.
:license: BSD, see LICENSE for more details.
"""
import sys
import json
import requests
import csv
import pprint
import time
import random
import os
from sklearn.datasets import fetch_20newsgroups
from random import randrange
from pymongo import MongoClient
if os.environ.get("FLASK_PORT"):
FLASK_PORT = int(os.environ.get("FLASK_PORT"))
else:
FLASK_PORT = 5000
if os.environ.get("MONGO_PORT"):
MONGO_PORT = int(os.environ.get("MONGO_PORT"))
else:
MONGO_PORT = 27017
ENTRY_POINT = '127.0.0.1:{}'.format(FLASK_PORT)
OVERLAP = .15
categories = ['alt.atheism', 'soc.religion.christian', 'comp.graphics', 'sci.med']
data = fetch_20newsgroups(subset='test', categories=categories, shuffle=True, random_state=42)
#data.data, data.target
def create_collection(userid, labels):
collection = [{
'creator_id': userid,
'annotators': [userid],
'viewers': [userid],
'labels':labels,
'metadata': {'title':'Trial Collection', 'description':'This is a sample description of a collection'},
'archived': False,
'configuration': {
'allow_overlapping_ner_annotations': True
}
}]
r = perform_post('collections', json.dumps(collection))
return get_ids(r)
def create_documents(collection_id, user_id, num_docs):
data = fetch_20newsgroups(subset='test', categories=categories, shuffle=False)
docs = []
for i in range(num_docs):
docs.append({
'creator_id': user_id,
'collection_id': collection_id,
'overlap': 0,
'text': data.data[i]
})
r = perform_post('documents', json.dumps(docs))
print('Created:', len(docs), 'documents')
return get_ids(r)
def create_annotations(user_id, collection_id, doc_ids, categories):
data = fetch_20newsgroups(subset='test', categories=categories, shuffle=False)
annotations = []
for i, doc_id in enumerate(doc_ids):
annotations.append({
'creator_id': user_id,
'collection_id': collection_id,
'document_id': doc_id,
'annotation': [categories[data.target[i]]]
})
r = perform_post('annotations', json.dumps(annotations))
print('Created:', len(annotations), 'annotations')
return get_ids(r)
def update_annotations(annotation_ids):
#perform_get('annotations?where={"document_id": {"$in": ["5b3531a9aec9104c8a9aca9e", "5b3531a9aec9104c8a9acaa0"]}}'
for id in annotation_ids:
url = 'http://'+ENTRY_POINT+'/annotations/'+id
response = requests.get(url, headers={'Content-Type': 'application/json'})
if response.status_code == 200:
r = response.json()
etag = r['_etag']
headers = {'Content-Type': 'application/json', 'If-Match': etag}
data = {'annotation': ['soc.religion.christian']}
requests.patch('http://'+ENTRY_POINT+'/annotations/' + id, json.dumps(data), headers=headers)
def create_pipeline():
pipeline = [
{
"_id": "5babb6ee4eb7dd2c39b9671c",
"title": "Apache OpenNLP Named Entity Recognition",
"description": "Apache's open source natural language processing toolkit for named entity recognition (NER). See https://opennlp.apache.org/ for more information. This is the default pipeline used for NER.",
"name":"opennlp",
"parameters":{
"cutoff":"integer",
"iterations":"integer"
}
},
{
"_id": "5babb6ee4eb7dd2c39b9671d",
"title": "SpaCy Named Entity Recognition",
"description": "spaCy is a free open-source library for Natural Language Processing in Python. It features NER, POS tagging, dependency parsing, word vectors and more",
"name": "spaCy",
"parameters": {
"n_iter": "integer",
"dropout": "float"
}
},
{
"_id": "5babb6ee4eb7dd2c39b9671f",
"title": "Stanford CoreNLP Named Entity Recognition",
"description": "Stanford's natural language processing toolkit for named entity recognition (NER). See https://stanfordnlp.github.io/CoreNLP/ for more information.",
"name":"corenlp",
"parameters": {
"max_left":"integer",
"use_class_feature": [True, False],
"use_word": [True, False],
"use_ngrams": [True, False],
"no_mid_ngrams": [True, False],
"max_ngram_length":"integer",
"use_prev": [True, False],
"use_next": [True, False],
"use_disjunctive": [True, False],
"use_sequences": [True, False],
"use_prev_sequences": [True, False],
"use_type_seqs": [True, False],
"use_type_seqs2": [True, False],
"use_type_y_sequences": [True, False]
}
}
]
r = perform_post('pipelines', json.dumps(pipeline))
return get_ids(r)
def create_user():
users = []
users.append(
{
'_id':'bchee1',
'firstname': "Brant",
'lastname': 'Chee',
'email': 'bchee1@jhmi.edu',
'description': "Brant Developer",
'role': ['user']
})
users.append({
'_id': 'bchee2',
'firstname': "Brant",
'lastname': 'Chee',
'email': 'bchee2@jhmi.edu',
'description': "Brant administrator",
'role': ['administrator']
})
users.append(
{
'_id': 'lglende1',
'firstname': "Laura",
'lastname': 'Glendenning',
'email': 'lglende1@jh.edu',
'description': "Developer Laura",
'role': ['user']
})
users.append(
{
'_id': 'cahn9',
'firstname': "Charles",
'lastname': 'Ahn',
'email': 'cahn9@jh.edu',
'description': "Developer Charles",
'role': ['user']
})
r = perform_post('users', json.dumps(users))
return get_ids(r)
def create_classifier(collection_id, overlap, pipeline_id, labels):
'''{
'collection_id': {'type': 'objectid', 'required': True},
'overlap': {'type': 'float', 'required': True},
'pipeline_id': {'type': 'objectid', 'required': True},
'parameters': {'type': 'dict'}'''
classifier_obj = {'collection_id':collection_id,
'overlap':overlap,
'pipeline_id':pipeline_id,
'parameters':{"cutoff":1, "iterations":100},
'labels':labels
}
r = perform_post('classifiers', json.dumps(classifier_obj))
return get_ids(r)
def create_metrics(collection_id, classifier_id):
# create metrics for classifier
metrics_obj = {"collection_id": collection_id,
"classifier_id": classifier_id,
"documents": list(),
"annotations": list()
}
metrics_resp = perform_post("metrics", json.dumps(metrics_obj))
return get_ids(metrics_resp)
def create_next_ids(classifier_id, ann_ids, docids, overlap):
num_overlap = int(len(docids) * overlap)
# we're lazy and taking the first n docs as overlap
#'classifier_id': {'type': 'objectid', 'required': True},
#'document_ids': {'type': 'list', 'required': True},
#'overlap_document_ids': {'type': 'dict', 'required': True}
overlap_obj = {'classifier_id':classifier_id, 'document_ids':docids[num_overlap:], 'overlap_document_ids':{}}
for id in ann_ids:
overlap_obj['overlap_document_ids'][id] = docids[0:num_overlap]
r = perform_post('next_instances', json.dumps(overlap_obj))
return get_ids(r)
def get_ids(response):
valids = []
#print("Response:", response)
if response.status_code == 201:
r = response.json()
if r['_status'] == 'OK':
if '_items' in r:
for obj in r['_items']:
if obj['_status'] == "OK":
valids.append(obj['_id'])
else:
valids.append(r['_id'])
return valids
def perform_get(resource, data):
headers = {'Content-Type': 'application/json'}
return requests.get(endpoint(resource), data, headers=headers)
def perform_post(resource, data):
headers = {'Content-Type': 'application/json'}
return requests.post(endpoint(resource), data, headers=headers)
def endpoint(resource):
url = 'http://%s/%s/' % (
ENTRY_POINT if not sys.argv[1:] else sys.argv[1], resource)
return url
def delete_database(mongourl, database):
client = MongoClient(mongourl)
client.drop_database(database)
def create_bionlp_annotations(bionlpfile, num_docs, pipeline_id, creator_id, annotator_ids):
docs, anns, stats = load_bionlp(bionlpfile, num_docs)
categories = list(stats.keys())
#Create collection
collection = [{
'creator_id': creator_id,
'annotators': annotator_ids,
'viewers': annotator_ids,
'labels':categories,
'metadata': {'title':'NER Test Collection', 'description':'This is a sample collection to test NER tasks'},
'archived': False,
'configuration': {
'allow_overlapping_ner_annotations': True
}
}]
r = perform_post('collections', json.dumps(collection))
collection_id = get_ids(r)
collection_id = collection_id[0]
print("collection_id", collection_id)
#Create documents
images = [
'https://upload.wikimedia.org/wikipedia/commons/thumb/0/08/Unequalized_Hawkes_Bay_NZ.jpg/600px-Unequalized_Hawkes_Bay_NZ.jpg',
'https://cdn.indreams.me/cdf00b6d4827cd66511bdc35e1ef2ea3_10',
'/static/apl.png',
'/static/campus.jpg'
]
upload = []
for i in range(len(docs)):
upload.append({
'creator_id': creator_id,
'collection_id': collection_id,
'overlap': 0,
'text': docs[i],
'metadata': { 'imageUrl': images[randrange(0,len(images))] }
})
r = perform_post('documents', json.dumps(upload))
print('Created:', len(upload), 'documents')
doc_ids = get_ids(r)
classifier_ids = create_classifier(collection_id, 0, pipeline_id, categories)
print('Classifier id', classifier_ids)
metrics_ids = create_metrics(collection_id, classifier_ids[0])
print('Metrics id', metrics_ids)
next_ids = create_next_ids(classifier_ids[0], annotator_ids, doc_ids, 0)
annotations = []
for i, doc_id in enumerate(doc_ids):
annotations.append({
'creator_id': annotator_ids[random.randrange(0, len(annotator_ids))],
'collection_id': collection_id,
'document_id': doc_id,
'annotation': anns[i]
})
r = perform_post('annotations', json.dumps(annotations))
print('Created:', len(annotations), 'annotations')
return get_ids(r)
def load_bionlp(csvname, limit):
docs = []
anns = []
stats = {}
sentences_per_doc = 10 # total of 47959 sentences
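# Note on the expected input layout (a sketch; column names and sample values are
# assumptions, not taken from the file -- only columns 0, 1 and 3 are read below):
#   col 0: sentence id, e.g. "Sentence: 12" (blank on continuation rows for the same sentence)
#   col 1: the token text
#   col 3: a BIO tag, e.g. "O", "B-<label>", "I-<label>"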
with open(csvname, 'r', encoding='utf-8', errors='ignore') as csv_file:
csv_reader = csv.reader(csv_file, delimiter=',')
line_count = 0
doc_text = ''
doc_anns = []
sentence_id = 1
next(csv_reader)
for line in csv_reader:
if line[0] != '':
sentence_id = int(line[0].split(':')[1])
# avoids triggering on the first sentence
if sentence_id % sentences_per_doc == 1 and sentence_id >= sentences_per_doc: # once you have enough sentences per doc, append to list and clear doc_text/doc_anns
# print('Added case ' + cases[-1])
docs.append(doc_text)
doc_text = ''
anns.append(doc_anns)
doc_anns = []
if len(docs) > limit-2:
break
token = line[1]
# add token to text and record start/end char
start_char = len(doc_text)
doc_text += token
end_char = len(doc_text)
doc_text += ' '
if line[3] != 'O': # if label is not 'O'
label = line[3].split('-')[1] # has BILUO tags that we don't need, e.g. 'B-tag'
if label not in stats:
stats[label] = 0
if line[3].split('-')[0] == 'B':
stats[label] += 1 # only add if the label has the 'begin' tag otherwise labels spanning multiple tokens are added multiple times
doc_anns.append((start_char, end_char, label)) # add label to annotations
elif line[3].split('-')[0] == 'I':
# NOTE: assumes I-tags only ever follow B-tags, will break if not the case
doc_anns.append((doc_anns[-1][0], end_char,
label)) # if the label spans multiple tokens update the most recent annotation with the new end char
del doc_anns[-2]
line_count += 1
# append the remaining (final) document
docs.append(doc_text)
doc_text = ''
anns.append(doc_anns)
doc_anns = []
return docs, anns, stats
if __name__ == '__main__':
mongourl = 'mongodb://localhost:{}'.format(MONGO_PORT)
delete_database(mongourl, 'test') # old database
delete_database(mongourl, 'pmap_nlp')
#generate new data
user_ids = create_user()
pipeline_ids = create_pipeline()
collection_id = create_collection(user_ids[1], categories)
doc_ids = create_documents(collection_id[0], user_ids[1], 750)
classifier_ids = create_classifier(collection_id[0], OVERLAP, pipeline_ids[0], categories)
metrics_ids = create_metrics(collection_id[0], classifier_ids[0])
next_ids = create_next_ids(classifier_ids[0], [user_ids[1]], doc_ids, OVERLAP)
annotation_ids = create_annotations(user_ids[0], collection_id[0], doc_ids[int(len(doc_ids)/2):], categories)
print("collection_id=",collection_id[0])
print("classifier_id='", classifier_ids[0], "'")
print("metrics_id='", metrics_ids[0], "'")
#update_annotations(annotation_ids[int(len(annotation_ids)/2):])
collection_id2 = create_collection(user_ids[1], categories)
doc_ids2 = create_documents(collection_id2[0], user_ids[1], 500)
annotation_ids2 = create_annotations(user_ids[0], collection_id2[0], doc_ids2, categories)
update_annotations(annotation_ids2[0:int(len(annotation_ids2) / 2)])
print('user_ids=',user_ids)
print('pipeline_ids=',pipeline_ids)
print('collection_id=',collection_id)
print('doc_ids=',doc_ids)
print('classifier_ids=',classifier_ids)
print('next_ids=',next_ids)
print('annotation_ids1=', annotation_ids)
print('annotation_ids2=',annotation_ids2)
print('collection_id2=', collection_id2)
print('doc_ids2=', doc_ids2)
ner_file = os.path.join(os.path.dirname(os.path.realpath(__file__)), 'ner_dataset.csv')
#user_ids = create_user()
#pipeline_ids = create_pipeline()
#print('user_ids', user_ids)
#print('pipeline_ids', pipeline_ids)
create_bionlp_annotations(ner_file, 150, pipeline_ids[1], user_ids[0], user_ids)
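As a hedged sketch of how this setup script might be invoked (the filename below is a placeholder, not the module's actual name): `endpoint()` uses `ENTRY_POINT` by default, and an optional first argument overrides the host and port.

```bash
# Illustrative invocation -- the script name is a placeholder; the host:port assumes
# the eve layer from the dev stack is reachable on localhost:5001.
python3 setup_test_data.py localhost:5001
```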

1048576
eve/test/ner_dataset.csv Normal file

File diff suppressed because it is too large

View File

@@ -0,0 +1,3 @@
# (C) 2019 The Johns Hopkins University Applied Physics Laboratory LLC.
node_modules/

View File

@@ -0,0 +1,14 @@
# (C) 2019 The Johns Hopkins University Applied Physics Laboratory LLC.
# Editor configuration, see http://editorconfig.org
root = true
[*]
charset = utf-8
indent_style = space
indent_size = 2
insert_final_newline = true
trim_trailing_whitespace = true
[*.md]
max_line_length = off
trim_trailing_whitespace = false

36
frontend/annotation/.gitignore vendored Normal file
View File

@@ -0,0 +1,36 @@
# (C) 2019 The Johns Hopkins University Applied Physics Laboratory LLC.
# See http://help.github.com/ignore-files/ for more about ignoring files.
# compiled output
/dist
/tmp
/out-tsc
# dependencies
/node_modules
# IDEs and editors
/.idea
.project
.classpath
.c9/
*.launch
.settings/
*.sublime-workspace
# IDE - VSCode
.vscode/*
# misc
/.sass-cache
/connect.lock
/coverage
/libpeerconnection.log
npm-debug.log
yarn-error.log
testem.log
/typings
# System Files
.DS_Store
Thumbs.db

View File

@@ -0,0 +1,69 @@
# (C) 2019 The Johns Hopkins University Applied Physics Laboratory LLC.
FROM ubuntu:18.04
ENV LC_ALL C.UTF-8
ENV LANG C.UTF-8
ENV DEBIAN_FRONTEND noninteractive
RUN apt-get clean && \
apt-get -y update && \
apt-get -y install software-properties-common
RUN apt-get -y update && \
apt-get -y install git build-essential curl jq wget ruby gettext-base && \
gem install mustache
RUN wget https://nginx.org/keys/nginx_signing.key && \
apt-key add nginx_signing.key && \
rm nginx_signing.key && \
echo "deb https://nginx.org/packages/ubuntu/ bionic nginx" && \
apt-get -y remove nginx* && \
apt-get -y update && \
apt-get -y install nginx
ARG NODE_VERSION=10
RUN curl -sL https://deb.nodesource.com/setup_${NODE_VERSION}.x | bash -
RUN apt-get -y update && \
apt-get install -y nodejs
ARG ROOT_DIR=/nlp-web-app/frontend
ARG SERVER_TYPE=http
EXPOSE 80 443
RUN mkdir -p $ROOT_DIR
WORKDIR $ROOT_DIR
ADD angular.json $ROOT_DIR/
ADD package*.json $ROOT_DIR/
RUN npm install
ADD e2e/ $ROOT_DIR/e2e
ADD nginx/ $ROOT_DIR/nginx
ADD tsconfig.json $ROOT_DIR/
ADD tslint.json $ROOT_DIR/
RUN mkdir -p /etc/ssl/private/ /etc/ssl/certs/ /etc/nginx/snippets/
ADD nginx/certs/server.key /etc/ssl/private/
ADD nginx/certs/server.crt /etc/ssl/certs/
ADD nginx/certs/dhparam.pem /etc/nginx/
ADD nginx/snippets/* /etc/nginx/snippets/
RUN echo "---" > data.yml && \
echo "ROOT_DIR: $ROOT_DIR" >> data.yml && \
echo "---" >> data.yml
RUN mustache data.yml nginx/nlp-web-app.$SERVER_TYPE.mustache > nginx/nlp-web-app && \
rm -f nginx/nlp-web-app.$SERVER_TYPE.mustache && \
ln -s /etc/nginx/sites-available/nlp-web-app /etc/nginx/sites-enabled/ && \
rm -f /etc/nginx/sites-enabled/default
ADD docker_run.sh $ROOT_DIR/
ADD src/ $ROOT_DIR/src
RUN npm run prod
CMD ["./docker_run.sh"]

View File

@@ -0,0 +1,29 @@
&copy; 2019 The Johns Hopkins University Applied Physics Laboratory LLC.
# PINE
This project was generated with [Angular CLI](https://github.com/angular/angular-cli) version 6.0.8.
## Development server
Run `ng serve` for a dev server. Navigate to `http://localhost:4200/`. The app will automatically reload if you change any of the source files.
## Code scaffolding
Run `ng generate component component-name` to generate a new component. You can also use `ng generate directive|pipe|service|class|guard|interface|enum|module`.
## Build
Run `ng build` to build the project. The build artifacts will be stored in the `dist/` directory. Use the `--prod` flag for a production build.
## Running unit tests
Run `ng test` to execute the unit tests via [Karma](https://karma-runner.github.io).
## Running end-to-end tests
Run `ng e2e` to execute the end-to-end tests via [Protractor](http://www.protractortest.org/).
## Further help
To get more help on the Angular CLI use `ng help` or go check out the [Angular CLI README](https://github.com/angular/angular-cli/blob/master/README.md).

View File

@@ -0,0 +1,133 @@
{
"$schema": "./node_modules/@angular/cli/lib/config/schema.json",
"version": 1,
"newProjectRoot": "projects",
"projects": {
"pine": {
"root": "",
"sourceRoot": "src",
"projectType": "application",
"prefix": "app",
"schematics": {},
"architect": {
"build": {
"builder": "@angular-devkit/build-angular:browser",
"options": {
"outputPath": "dist/pine",
"index": "src/index.html",
"main": "src/main.ts",
"polyfills": "src/polyfills.ts",
"tsConfig": "src/tsconfig.app.json",
"assets": [
"src/favicon.ico",
"src/assets"
],
"styles": [
"src/styles.css",
"src/themes.scss"
],
"scripts": [
"node_modules/jquery/dist/jquery.min.js"
]
},
"configurations": {
"production": {
"fileReplacements": [
{
"replace": "src/environments/environment.ts",
"with": "src/environments/environment.prod.ts"
}
],
"optimization": true,
"outputHashing": "all",
"sourceMap": false,
"extractCss": true,
"namedChunks": false,
"aot": true,
"extractLicenses": true,
"vendorChunk": false,
"buildOptimizer": true
}
}
},
"serve": {
"builder": "@angular-devkit/build-angular:dev-server",
"options": {
"browserTarget": "pine:build"
},
"configurations": {
"production": {
"browserTarget": "pine:build:production"
}
}
},
"extract-i18n": {
"builder": "@angular-devkit/build-angular:extract-i18n",
"options": {
"browserTarget": "pine:build"
}
},
"test": {
"builder": "@angular-devkit/build-angular:karma",
"options": {
"main": "src/test.ts",
"polyfills": "src/polyfills.ts",
"tsConfig": "src/tsconfig.spec.json",
"karmaConfig": "src/karma.conf.js",
"styles": [
"src/styles.css",
"src/themes.scss"
],
"scripts": [
"node_modules/venn.js/venn.js"
],
"assets": [
"src/favicon.ico",
"src/assets"
]
}
},
"lint": {
"builder": "@angular-devkit/build-angular:tslint",
"options": {
"tsConfig": [
"src/tsconfig.app.json",
"src/tsconfig.spec.json"
],
"exclude": [
"**/node_modules/**"
]
}
}
}
},
"pine-e2e": {
"root": "e2e/",
"projectType": "application",
"architect": {
"e2e": {
"builder": "@angular-devkit/build-angular:protractor",
"options": {
"protractorConfig": "e2e/protractor.conf.js",
"devServerTarget": "pine:serve"
},
"configurations": {
"production": {
"devServerTarget": "pine:serve:production"
}
}
},
"lint": {
"builder": "@angular-devkit/build-angular:tslint",
"options": {
"tsConfig": "e2e/tsconfig.e2e.json",
"exclude": [
"**/node_modules/**"
]
}
}
}
}
},
"defaultProject": "pine"
}

6
frontend/annotation/dev_run.sh Executable file
View File

@@ -0,0 +1,6 @@
#!/bin/bash
# (C) 2019 The Johns Hopkins University Applied Physics Laboratory LLC.
set -x
npm start

View File

@@ -0,0 +1,36 @@
#!/bin/bash
# (C) 2019 The Johns Hopkins University Applied Physics Laboratory LLC.
if [[ -z ${BACKEND_SERVER} ]]; then
echo ""
echo ""
echo ""
echo "~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~"
echo "Please set BACKEND_SERVER environment variable"
echo "~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~"
echo ""
echo ""
echo ""
exit 1
fi
if [[ -z ${SERVER_NAME} ]]; then
echo ""
echo ""
echo ""
echo "~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~"
echo "Please set SERVER_NAME environment variable"
echo "~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~"
echo ""
echo ""
echo ""
exit 1
fi
set -e
export LOG_FORMAT_SNIPPET="${LOG_FORMAT_SNIPPET:-snippets/default-logging.conf}"
envsubst '${BACKEND_SERVER} ${SERVER_NAME} ${LOG_FORMAT_SNIPPET}' < nginx/nlp-web-app > /etc/nginx/sites-available/nlp-web-app
nginx -g 'daemon off;'
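A hedged example of supplying the two required variables when running the frontend container directly; the script above exits with an error if either is unset. The values, image tag, and port mapping are illustrative; in the compose stack they come from the environment.

```bash
# Illustrative values only; the compose stack normally provides these.
docker run \
  -e BACKEND_SERVER=http://backend:5000 \
  -e SERVER_NAME=localhost \
  -p 8888:443 \
  pine-frontend
```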

View File

@@ -0,0 +1,30 @@
/*(C) 2019 The Johns Hopkins University Applied Physics Laboratory LLC. */
// Protractor configuration file, see link for more information
// https://github.com/angular/protractor/blob/master/lib/config.ts
const { SpecReporter } = require('jasmine-spec-reporter');
exports.config = {
allScriptsTimeout: 11000,
specs: [
'./src/**/*.e2e-spec.ts'
],
capabilities: {
'browserName': 'chrome'
},
directConnect: true,
baseUrl: 'http://localhost:4200/',
framework: 'jasmine',
jasmineNodeOpts: {
showColors: true,
defaultTimeoutInterval: 30000,
print: function() {}
},
onPrepare() {
require('ts-node').register({
project: require('path').join(__dirname, './tsconfig.e2e.json')
});
jasmine.getEnv().addReporter(new SpecReporter({ spec: { displayStacktrace: true } }));
}
};

View File

@@ -0,0 +1,16 @@
/*(C) 2019 The Johns Hopkins University Applied Physics Laboratory LLC. */
import { AppPage } from './app.po';
describe('workspace-project App', () => {
let page: AppPage;
beforeEach(() => {
page = new AppPage();
});
it('should display welcome message', () => {
page.navigateTo();
expect(page.getParagraphText()).toEqual('Welcome to PINE!');
});
});

View File

@@ -0,0 +1,13 @@
/*(C) 2019 The Johns Hopkins University Applied Physics Laboratory LLC. */
import { browser, by, element } from 'protractor';
export class AppPage {
navigateTo() {
return browser.get('/');
}
getParagraphText() {
return element(by.css('app-root h1')).getText();
}
}

View File

@@ -0,0 +1,13 @@
{
"extends": "../tsconfig.json",
"compilerOptions": {
"outDir": "../out-tsc/app",
"module": "commonjs",
"target": "es5",
"types": [
"jasmine",
"jasminewd2",
"node"
]
}
}

View File

@@ -0,0 +1,17 @@
#!/bin/bash
# (C) 2019 The Johns Hopkins University Applied Physics Laboratory LLC.
# based on
# https://www.digitalocean.com/community/tutorials/how-to-create-a-self-signed-ssl-certificate-for-nginx-in-ubuntu-18-04
set -x
# 5 years
DAYS="1825"
openssl req -x509 -nodes -days ${DAYS} \
-newkey rsa:2048 -keyout nginx/certs/server.key \
-out nginx/certs/server.crt \
-subj '/C=US/ST=Maryland/L=Laurel/O=JHU\/APL/OU=PMAP\/NLP/CN=pmap-nlp'
openssl dhparam -out nginx/certs/dhparam.pem 4096

View File

@@ -0,0 +1,13 @@
-----BEGIN DH PARAMETERS-----
MIICCAKCAgEA/hPh4yCR2jfQmShbhtSIGNgDbpJocSvGzhDdgNwV5a2kyidXWe6r
Ab6rbcjs78K3NQ+ZeNdHCTYEUDVD9KGEeTwUhASNEJTk1eDPT9Ll704pvHFkwL45
0PAGWStAPFoHhgmebkL9a3QarL1OUZ3u0u/0zoIo2Rqp6NYj/AMa5XqzG1aItxoV
c1tI1A1cdCiU0+X3nqXG5uFFn06faiqaY2Ykn5NXMC33VIyxX8xIeWg7zUXyLEj3
5oN2ssVyWkbJ3eoCK+tx3rIFDcSwXKBLaKKDYQNKvH9Uwuq9nWtSAAuX2PkdJX/+
Seide6Ek/YJa7nONLQ0s7C9FjOJIGnB9CV7wHicSgjXSOb7wnPM56hSoCr6Emsoo
k1N7Ck9iFI5v6abdJ5AB/Mo8swhp1mVmcuwMaF8GOGGTrB1vabOUKPFKO8hSSCOQ
qrtkpczH8o958AEK7I67awA10QgnISQ5Hh2HWdQxzmHVgFrYCs0Za0D0vi8lqcW5
LVSddLC0S+WvhRezymMWV+nVyypUh7dzRtyd2ausOs9nGbG1zbjAKmbkcYnHerXn
rh/T2h6EE/ZH7gdkO9niOTh2mq2G9lrV4LC648S2i7EzVYtP8qAL8zap/+7BTZl8
XuG8+WZ1bKUbZsRINMz+Nt+8HrSyFkFWjQWdfzlsYDciVTZ3O3Kuv4sCAQI=
-----END DH PARAMETERS-----

View File

@@ -0,0 +1,22 @@
-----BEGIN CERTIFICATE-----
MIIDqDCCApCgAwIBAgIJAPt+RGFSlFbOMA0GCSqGSIb3DQEBCwUAMGkxCzAJBgNV
BAYTAlVTMREwDwYDVQQIDAhNYXJ5bGFuZDEPMA0GA1UEBwwGTGF1cmVsMRAwDgYD
VQQKDAdKSFUvQVBMMREwDwYDVQQLDAhQTUFQL05MUDERMA8GA1UEAwwIcG1hcC1u
bHAwHhcNMTgxMTEzMTgxNDQxWhcNMjMxMTEyMTgxNDQxWjBpMQswCQYDVQQGEwJV
UzERMA8GA1UECAwITWFyeWxhbmQxDzANBgNVBAcMBkxhdXJlbDEQMA4GA1UECgwH
SkhVL0FQTDERMA8GA1UECwwIUE1BUC9OTFAxETAPBgNVBAMMCHBtYXAtbmxwMIIB
IjANBgkqhkiG9w0BAQEFAAOCAQ8AMIIBCgKCAQEAy0lBDJSEAr7Hlq/zhkSTrk/D
stoWgQLcoBOmBfs4RZHlKDeS/FOXs3fwInyNbapclOHUb1kxPdHpsoWaOSfd71xn
PQuoFrGKu56NOYlLWwJzFJ/9dYutYPAG7/ZR1tlUZKaJh0E9Lt05UcN9Ugp0LZtb
9vUfRvhwOhrc4aQrTRzoxAK4gNsBxe876Stlt7ScsX6qGxrY1VGH0CjHnUdT5Ivh
/rgZ57XWbzG/PiIdHbL8ZDQfku3RMvDe4l7v+Fv5U/sn4V/CAgGgjEwICfwaqXUB
QYwtiuAkPLrj/Kn/uhy1CvuP4BSfTvfnLWTSFWPmYEbWkkT1vANL91PpYywazQID
AQABo1MwUTAdBgNVHQ4EFgQUODsPEQPDvFV6+j7CqGv0kDZBDgswHwYDVR0jBBgw
FoAUODsPEQPDvFV6+j7CqGv0kDZBDgswDwYDVR0TAQH/BAUwAwEB/zANBgkqhkiG
9w0BAQsFAAOCAQEAysGLGV2FZF0v68P0Gxc3gA9PdGVekjMNewsU9F/3fQCqUK7l
kdkxut6X3cSqf0pER5efr6Ow7n0TojPmca9I4eqnK+dxZkeTluZX3clXLLCkqsp6
aeoCK70y9oLsTjMPT17y5/osYtVt8ep4lj6Qzuj7ikLK52Iee4FRSj++9UiHl5t7
q20XxOa2Vxl0Rt6riE3ADykkOnc0RqYT0FFimEBLaIHIJXStMDCXJjHJh3lAZY0R
9FwTcK/5JsRs8+qJyo4Q+foFi697azp7ui3jPD+lPv5/64YUbEe3KyPXBa2wqEz5
DYWAM4n1rfTOn74qecebTlIJQ2sEJYHIGU38+A==
-----END CERTIFICATE-----

View File

@@ -0,0 +1,28 @@
-----BEGIN PRIVATE KEY-----
MIIEvQIBADANBgkqhkiG9w0BAQEFAASCBKcwggSjAgEAAoIBAQDLSUEMlIQCvseW
r/OGRJOuT8Oy2haBAtygE6YF+zhFkeUoN5L8U5ezd/AifI1tqlyU4dRvWTE90emy
hZo5J93vXGc9C6gWsYq7no05iUtbAnMUn/11i61g8Abv9lHW2VRkpomHQT0u3TlR
w31SCnQtm1v29R9G+HA6GtzhpCtNHOjEAriA2wHF7zvpK2W3tJyxfqobGtjVUYfQ
KMedR1Pki+H+uBnntdZvMb8+Ih0dsvxkNB+S7dEy8N7iXu/4W/lT+yfhX8ICAaCM
TAgJ/BqpdQFBjC2K4CQ8uuP8qf+6HLUK+4/gFJ9O9+ctZNIVY+ZgRtaSRPW8A0v3
U+ljLBrNAgMBAAECggEAa0GBMrQBWrlx8Q1wvXzdNnEbXfg3O2ZZJZR4WluL+xjZ
AXkg8kTgm25Cos94h04FfwAP55f1pRpl5S0ci99+91WXmtvVmfOesRMcjCjmO2R1
d4JaZnSFy8mYv28FCwirwFcl5NkFAP7zyTINowWk+pMn2IrIL9fQzrdxpxPJTOtt
IlCRcURlEmmwiphiKz9vWZ410JGGLZIIhygLyVGX6NEUm1wuYS5WuNmeEfyP8694
HJslz9DYfyUlaRiW3rWNml7aWHYvkxt5h1N3gf/tD546fWTOa0yQyhQo604kWytD
2FuKdx1vDv0IbsNoajrXAgQ/AeF/dN5HKN2GvXXZeQKBgQDkTgeukin2Pm9nXhD1
zxOqnc2Af4s1PMV5/e/IovVZWtrVWyZlVJBhgrslMvjKj2gvaXon0Hw5U4K4W0+B
VQ4fckqGNG22A6XRUmiX0pO5OEPICmNfyyN1NWituxrlrEsMQR800bjakEA5FC26
ifI31jusySXSmaBOj/UViEzEbwKBgQDj8kYNqj1TNFoQqHlFXuGS/iLbDNrpr0J2
Qv1B1ip21yewKlKhCL0vkr6aE3493lQwnaeLcD6f+Op7S+QfJM3MBdDgSPqTUii4
Yzy/pUuh13SUiJ5UPbkwRmg8zgavurHYnkXwZUquCuY85lMJ/7cHcxVT/+8MdkGO
In9cl1zKgwKBgQDIkYCQLdpteWZXkj0mJdjqMB4EwIgkqhH23U8VnYwcBwRvIde2
d7cr4zTUNlZ5ZckqtehaJ/+qQSJ7IcTUI0v39mlgQ5kKqWO4ZER89MNQmgx6Jh4t
XwH0i4o97j1v/pAj4OYwefqDEO1K995AncXMpgng/wmaXdqGilPOqeJ/QwKBgGOl
IDyPA/ngc9K+Yy0RGhjw4XnSd8wZ4jric+WY4r1Ktr3K8o4UzOcEBjBCfzg6faE2
+ev5qFa0MISvm0yGATTEAhhZrrhB/S0FrKO2dYaNMhhQVK5MwSy6SozyH3goa+Be
6AH7tZa5iwZqRTikwXUPOO6cffp7o5Knv/dQ765TAoGARNGjmkneuEGmd+vgrkll
CzjpMFBNcFNl/84l0RHKlu2mAcJNmMgnvFaluGnIBDwk1xdMcOrfmczJ+L3pIE36
lGu4C8CUvsKKHd6uq5L04nh+Myl6hMf+T8shTOSpbtYRDwT6TldsQ7dkaTkKVElF
4HN4HZSEjFUt52drMM20O0E=
-----END PRIVATE KEY-----

View File

@@ -0,0 +1,24 @@
include ${LOG_FORMAT_SNIPPET};
server {
listen 80;
server_name ${SERVER_NAME};
client_max_body_size 0;
location / {
root {{ROOT_DIR}}/dist/pine;
index index.html;
expires -1;
add_header Pragma "no-cache";
add_header Cache-Control "no-store, no-cache, must-revalidate, post-check=0, pre-check=0";
try_files $uri$args $uri$args/ $uri $uri/ /index.html =404;
}
location /api {
include proxy_params;
rewrite ^/api(.*) $1 break;
proxy_pass ${BACKEND_SERVER};
}
}
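To illustrate the `/api` block above: the rewrite strips the `/api` prefix and proxies the remainder of the path to `${BACKEND_SERVER}`. A sketch, with a made-up route and host:

```bash
# Illustrative only -- the route and host are assumptions, not documented endpoints.
curl http://localhost/api/ping   # proxied to ${BACKEND_SERVER}/ping by the rewrite above
```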

View File

@@ -0,0 +1,34 @@
include ${LOG_FORMAT_SNIPPET};
server {
listen 443 ssl;
include snippets/self-signed.conf;
include snippets/ssl-params.conf;
server_name ${SERVER_NAME};
client_max_body_size 0;
location / {
root {{ROOT_DIR}}/dist/pine;
index index.html;
expires -1;
add_header Pragma "no-cache";
add_header Cache-Control "no-store, no-cache, must-revalidate, post-check=0, pre-check=0";
try_files $uri$args $uri$args/ $uri $uri/ /index.html =404;
}
location /api {
include proxy_params;
rewrite ^/api(.*) $1 break;
proxy_pass ${BACKEND_SERVER};
}
}
server {
listen 80;
server_name ${SERVER_NAME};
return 302 https://$server_name$request_uri;
}

Some files were not shown because too many files have changed in this diff