Add development updates.

Including:
1. Generated python documentation in docs/.
2. Started a new python client in client/.
3. Moved testing data to test/.
4. Added Cypress UI tests and pytest tests in test/.
5. A number of bug fixes and improvements.
Laura Glendenning
2020-07-30 11:49:53 -04:00
parent f93826e252
commit d6aa00330d
432 changed files with 1246355 additions and 1050198 deletions

.dockerignore Normal file

@@ -0,0 +1,32 @@
# Git
**/.git/
# Docker
**/docker-compose*
/setup_docker_test_data.sh
# Node
**/node_modules/
/frontend/annotation/dist/
# Eclipse
**/.project
**/.pydevproject
**/.settings/
# Python
**/__pycache__
# Database
/eve/db/
# Logs
backend/pine/backend/logs
pipelines/JavaNER/pmap_api/logs
# Misc
/local_data/
/instance/
/redis/data
**/Dockerfile
/results/

.env Normal file

@@ -0,0 +1,31 @@
# (C) 2019 The Johns Hopkins University Applied Physics Laboratory LLC.
REDIS_PORT=6379
EVE_PORT=7510
BACKEND_PORT=7520
MONGO_PORT=27018
EVE_DB_VOLUME=eve_db
OPENNLP_ID=5babb6ee4eb7dd2c39b9671c
CORENLP_ID=5babb6ee4eb7dd2c39b9671d
DOCUMENT_CLASSIFIER_ID=5babb6ee4eb7dd2c39b9671b
EXPOSED_SERVER_TYPE=https
EXPOSED_SERVER_NAME=localhost
EXPOSED_PORT=8888
EXPOSED_SERVER_TYPE_PROD=http
EXPOSED_SERVER_NAME_PROD=annotation
EXPOSED_PORT_PROD=80
AUTH_MODULE=vegas
#MONGO_URI=
#VEGAS_CLIENT_SECRET=
# Change these to be volume names instead of paths if you want to use docker volumes
# If SHARED_VOLUME is a docker volume, be sure it is populated with the contents of ./shared
SHARED_VOLUME=./shared
MODELS_VOLUME=./local_data/models
LOGS_VOLUME=./local_data/logs
DOCUMENT_IMAGE_VOLUME=./local_data/test_document_images

.gitignore vendored Normal file

@@ -0,0 +1,62 @@
# (C) 2019 The Johns Hopkins University Applied Physics Laboratory LLC.
# Node
**/node_modules/
# IDEs
**/.project
**/.pydevproject
**/.settings/
**/.idea/
# Python
**/__pycache__
# Logs
backend/pine/backend/logs
pipelines/pine/pipelines/logs
# Source https://github.com/helm/charts/blob/c194bce22cf6eae521bdd79d12ee04d9b1cd3e50/.gitignore
# General files for the project
pkg/*
*.pyc
bin/*
.project
/.bin
/_test/secrets/*.json
# OSX leaves these everywhere on SMB shares
._*
# OSX trash
.DS_Store
# Files generated by JetBrains IDEs, e.g. IntelliJ IDEA
.idea/
*.iml
# Vscode files
.vscode
# Emacs save files
*~
\#*\#
.\#*
# Vim-related files
[._]*.s[a-w][a-z]
[._]s[a-w][a-z]
*.un~
Session.vim
.netrwhist
# Chart dependencies
**/charts/*.tgz
.history
# Pipelines and local data
/pipelines/models/
/local_data/*
!/local_data/dev/test_images/static/
/results/

README.md

@@ -1,49 +1,65 @@
# PINE (Pmap Interface for Nlp Experimentation)
██████╗ ██╗███╗ ██╗███████╗
██╔══██╗██║████╗ ██║██╔════╝
██████╔╝██║██╔██╗ ██║█████╗
██╔═══╝ ██║██║╚██╗██║██╔══╝
██║ ██║██║ ╚████║███████╗
╚═╝ ╚═╝╚═╝ ╚═══╝╚══════╝
Pmap Interface for Nlp Experimentation
© 2019 The Johns Hopkins University Applied Physics Laboratory LLC.
## About PINE
PINE is a web-based tool for text annotation. It enables annotation at the document level as well
as over text spans (words). The annotation facilitates generation of natural language processing
(NLP) models to classify documents and perform named entity recognition. Some of the features
include:
* Generate models in Spacy, OpenNLP, or CoreNLP on the fly and rank documents using Active Learning
to reduce annotation time.
* Extensible framework - add NLP pipelines of your choice.
* Active Learning support - out-of-the-box active learning
(https://en.wikipedia.org/wiki/Active_learning_(machine_learning)) with pluggable active learning
ranking functions.
* Facilitates group annotation projects - view other people's annotations, calculate
inter-annotator agreement, and display annotation performance.
* Enterprise authentication - integrate with your existing OAuth/Active Directory Servers.
* Scalability - deploy in docker compose or a kubernetes cluster; ability to use database as a
service such as CosmosDB.
PINE was developed under internal research and development (IRAD) funding at the
[Johns Hopkins University Applied Physics Laboratory](https://www.jhuapl.edu/). It was created to
support the annotation needs of NLP tasks on the
[precision medicine analytics platform (PMAP)](https://pm.jh.edu/) at Johns Hopkins.
## Required Resources
Note - download required resources and place in `pipelines/pine/pipelines/resources`:
* apache-opennlp-1.9.0
* stanford-corenlp-full-2018-02-27
* stanford-ner-2018-02-27
These are required to build docker images for active learning.
Alternatively, you can use the provided convenience script:
`./pipelines/download_resources.sh`
----------------------------------------------------------------------------------------------------
## Development Environment
First, refer to the various README files in the subproject directories for dependencies.
Install the pipenv environment in `pipelines`.
Alternatively, a convenience script is provided:
```bash
./setup_dev_stack.sh
```
Then a dev stack can be run with:
```bash
@@ -55,7 +71,7 @@ planning to use that auth module.
The dev stack can be stopped with Ctrl-C.
Sometimes mongod doesn't seem to start in time. If you see a connection
error for mongod, just close it and try it again.
Once the dev stack is up and running, the following ports are accessible:
@@ -63,6 +79,53 @@ Once the dev stack is up and running, the following ports are accessible:
* `localhost:5000` hosts the backend
* `localhost:5001` hosts the eve layer
### Generating documentation
1. See `docs/README.md` for information on required environment.
2. `./generate_documentation.sh`
3. Generated documentation can then be found in `./docs/build`.
### Testing Data
To import testing data, run the dev stack and then run:
```bash
./setup_dev_test_data.sh
```
*WARNING*: This script will remove any pre-existing data. If you need to clear your database
for other reasons, stop your dev stack and then `rm -rf local_data/dev/eve/db`.
### Testing
There are test cases written using Cypress; for more information, see `test/README.md`.
The short version, to run the tests using the docker-compose stack:
1. `test/build.sh`
2. `test/run_docker_compose.sh --report`
3. Check `./results/<timestamp>` (the script in the previous step will print out the exact path) for:
* `reports/report.html`: an HTML report of tests run and their status
* `screenshots/`: for any screenshots from failed tests
* `videos/`: for videos of all the tests that were run
To use the interactive dashboard:
1. `test/build.sh`
2. `test/run_docker_compose.sh --dashboard`
It is also possible to run the cypress container directly, or locally with the dev stack. For more
information, see `test/README.md`.
### Versioning
There are three versions being tracked:
* overall version: environment variable `PINE_VERSION` based on the git tag/revision information (see `./version.sh`)
* eve/database version: controlled in `eve/python/settings.py`
* frontend version: controlled in `frontend/annotation/package.json`
The eve/database version should be bumped up when the schema changes. This will (eventually) be
used to implement data migration.
The frontend version is the least important.
### Using the copyright checking pre-commit hook
The script `pre-commit` is provided as a helpful utility to make sure that new files checked into
@@ -74,27 +137,23 @@ installed manually:
This hook greps for the copyright text in new files and gives you the option to abort if it is
not found.
----------------------------------------------------------------------------------------------------
## Docker Environments
*IMPORTANT*:
For all the docker-compose environments, it is required to set a `PINE_VERSION` environment
variable. To do this, either prepend each docker-compose command:
```bash
PINE_VERSION=$(./version.sh) docker-compose ...
```
Or export it in your shell:
```bash
export PINE_VERSION=$(./version.sh)
docker-compose ...
```
The docker environment is run using docker-compose. There are two supported configurations: the
default and the prod configuration.
@@ -105,25 +164,23 @@ To build the images for DEFAULT configuration:
```bash
docker-compose build
```
Or use the convenience script:
```bash
./run_docker_compose.sh --build
```
To run containers for DEFAULT configuration (add the `-d` flag to run as daemons):
```bash
docker-compose up
```
You may also want the `--abort-on-container-exit` flag, which will make errors more apparent.
Or use the convenience script:
```bash
./run_docker_compose.sh --up
```
To bring containers down for DEFAULT configuration:
```bash
docker-compose down
```
With default settings, the webapp will now be accessible at `https://localhost:8888`
### Production Docker Environment
@@ -135,24 +192,31 @@ docker-compose -f docker-compose.yml -f docker-compose.prod.yml build
Note that you probably need to update `.env` and add the `MONGO_URI` property.
### Test data
To import test data, you need to run the docker-compose stack using the docker-compose.test.yml file:
```bash
docker-compose build
docker-compose -f docker-compose.yml -f docker-compose.override.yml -f docker-compose.test.yml up
```
Or use the convenience script:
```bash
./run_docker_compose.sh --build
./run_docker_compose.sh --up-test
```
Once the system is up and running:
```bash
./setup_docker_test_data.sh
```
Once the test data has been imported, you no longer need to use the docker-compose.test.yml file.
If you need to clear the database, bring down the container and remove the `nlp_webapp_eve_db` and
`nlp_webapp_eve_logs` volumes with `docker volume rm`.
If you are migrating from very old PINE versions, it is possible that you need to migrate your
data if you are seeing applications errors:
```bash
docker-compose exec eve python3 python/update_documents_annnotation_status.py
```
@@ -184,6 +248,23 @@ docker-compose exec backend scripts/data/set_user_password.sh <email username> <
Alternatively, there is an Admin Dashboard through the web interface.
----------------------------------------------------------------------------------------------------
## Misc Configuration
### Configuring Logging
See logging configuration files in `./shared/`. `logging.python.dev.json` is used with the
dev stack; the other files are used in the docker containers.
The docker-compose stack is currently set to bind the `./shared/` directory into the containers
at run-time. This allows for configuration changes of the logging without needing to rebuild
containers, and also allows the python logging config to live in one place instead of spread out
into each container. This is controlled with the `${SHARED_VOLUME}` variable from `.env`.
Log files will be stored in the location given by the `${LOGS_VOLUME}` variable from `.env`.
Pipeline model files will be stored in the location given by the `${MODELS_VOLUME}` variable from `.env`.
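For illustration, here is a minimal sketch of how a container can apply the shared JSON config at
startup. The `PINE_LOGGING_CONFIG_FILE` variable appears in the gunicorn config later in this diff;
the exact loader used by each service is an assumption here.
```python
# Minimal sketch: apply the shared JSON logging config if one is mounted.
# The basicConfig fallback is an assumption, not the services' actual behavior.
import json
import logging.config
import os

config_file = os.environ.get("PINE_LOGGING_CONFIG_FILE")
if config_file and os.path.isfile(config_file):
    with open(config_file, "r") as f:
        logging.config.dictConfig(json.load(f))
else:
    logging.basicConfig(level=logging.INFO)
```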
### Collection/Document Images
It is now possible to explore images in the "annotate document" page in the frontend UI. The image


@@ -0,0 +1,4 @@
© 2019 The Johns Hopkins University Applied Physics Laboratory LLC.
* [![Prod Build Status](https://dev.azure.com/JH-PMAP/APPLICATIONS/_apis/build/status/oa-nlp_annotator/oa-nlp_annotator%20CI?branchName=master)](https://dev.azure.com/JH-PMAP/APPLICATIONS/_build/latest?definitionId=5&branchName=master)
* [![Dev Build Status](https://dev.azure.com/JH-PMAP/APPLICATIONS/_apis/build/status/oa-nlp_annotator/oa-nlp_annotator%20CI?branchName=develop)](https://dev.azure.com/JH-PMAP/APPLICATIONS/_build/latest?definitionId=5&branchName=develop)


@@ -5,6 +5,8 @@ parameters:
appUrl: ""
azureContainerRegistry: $(azureContainerRegistry)
azureSubscriptionEndpointForSecrets: $(azureSubscriptionEndpointForSecrets)
backendStorageMountPath: "/mnt/azure"
backendStorageShareName: ""
deployEnvironment: $(deployEnvironment)
deploymentName: "CONTAINER_DEPLOY"
helmChart: "pine-chart"
@@ -94,6 +96,10 @@ jobs:
image:
repository: ${{ parameters.azureContainerRegistry }}/${{ parameters.backendImageName }}
tag: ${{ parameters.imageTag }}
persistence:
enabled: true
shareName: ${{ parameters.backendStorageShareName }}
mountPath: ${{ parameters.backendStorageMountPath }}
nlpAnnotation:
image:
repository: ${{ parameters.azureContainerRegistry }}/${{ parameters.pipelineImageName }}


@@ -117,11 +117,15 @@ stages:
backendImageName: $(backendImageName)
frontendImageName: $(frontendImageName)
pipelineImageName: $(pipelineImageName)
backendStorageShareName: "pine-files-dev"
secrets:
backend:
VEGAS_CLIENT_SECRET: $(vegas-client-secret-dev)
eve:
MONGO_URI: $(mongo-uri-dev)
azure-secret:
azurestorageaccountname: $(azure-storage-account-name-dev)
azurestorageaccountkey: $(azure-storage-account-key-dev)
- stage: deploy_to_prod
displayName: Deploy to prod
condition: and(succeeded(), eq(variables['build.sourceBranch'], 'refs/heads/master'))
@@ -140,8 +144,12 @@ stages:
backendImageName: $(backendImageName)
frontendImageName: $(frontendImageName)
pipelineImageName: $(pipelineImageName)
backendStorageShareName: "pine-files-prod"
secrets:
backend:
VEGAS_CLIENT_SECRET: $(vegas-client-secret-prod)
eve:
MONGO_URI: $(mongo-uri-prod)
azure-secret:
azurestorageaccountname: $(azure-storage-account-name-prod)
azurestorageaccountkey: $(azure-storage-account-key-prod)


@@ -21,9 +21,6 @@ First-time setup:
Running the server:
* `./dev_run.sh`
Once test data has been set up in the eve layer, the script `setup_dev_data.sh`
can be used to set up data from the backend's perspective.
## Setup
Before running, you must edit ../.env and set `VEGAS_CLIENT_SECRET` appropriately


@@ -4,7 +4,7 @@
DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )"
CONFIG_FILE="${DIR}/pine/backend/config.py"
if [[ -z ${VEGAS_CLIENT_SECRET} ]]; then
if ([[ -z ${AUTH_MODULE} ]] || [[ ${AUTH_MODULE} == "vegas" ]]) && [[ -z ${VEGAS_CLIENT_SECRET} ]]; then
echo ""
echo ""
echo ""
@@ -19,4 +19,4 @@ fi
export FLASK_APP="pine.backend"
export FLASK_ENV="development"
pipenv run flask run
pipenv run flask run "$@"


@@ -5,6 +5,8 @@ import json
bind = "0.0.0.0:${PORT}"
workers = ${WORKERS}
accesslog = "-"
timeout = 60
limit_request_line = 0
if "PINE_LOGGING_CONFIG_FILE" in os.environ and os.path.isfile(os.environ["PINE_LOGGING_CONFIG_FILE"]):
with open(os.environ["PINE_LOGGING_CONFIG_FILE"], "r") as f:


@@ -3,7 +3,7 @@
GUNICORN_CONFIG_FILE="config.py"
if [[ -z ${VEGAS_CLIENT_SECRET} ]]; then
if ([[ -z ${AUTH_MODULE} ]] || [[ ${AUTH_MODULE} == "vegas" ]]) && [[ -z ${VEGAS_CLIENT_SECRET} ]]; then
echo ""
echo ""
echo ""


@@ -1,3 +1,5 @@
# (C) 2019 The Johns Hopkins University Applied Physics Laboratory LLC.
from .app import create_app
from .app import create_app, VERSION
__version__ = VERSION


@@ -6,7 +6,7 @@ import logging
from flask import abort, Blueprint, jsonify, request
from werkzeug import exceptions
from .. import auth, log
from .. import auth, collections, log
from ..data import service
from ..documents import bp as documents
@@ -19,7 +19,7 @@ CONFIG_ALLOW_OVERLAPPING_NER_ANNOTATIONS = "allow_overlapping_ner_annotations"
bp = Blueprint("annotations", __name__, url_prefix = "/annotations")
def check_document(doc_id):
def check_document_by_id(doc_id: str):
"""
Verify that a document with the given doc_id exists and that the logged in user has permissions to access the
document
@@ -29,6 +29,10 @@ def check_document(doc_id):
if not documents.user_can_view_by_id(doc_id):
raise exceptions.Unauthorized()
def check_document(doc: dict):
if not documents.user_can_view(doc):
raise exceptions.Unauthorized()
@bp.route("/mine/by_document_id/<doc_id>")
@auth.login_required
def get_my_annotations_for_document(doc_id):
@@ -38,7 +42,7 @@ def get_my_annotations_for_document(doc_id):
:param doc_id: str
:return: Response
"""
check_document(doc_id)
check_document_by_id(doc_id)
where = {
"document_id": doc_id,
"creator_id": auth.get_logged_in_user()["id"]
@@ -57,7 +61,7 @@ def get_others_annotations_for_document(doc_id):
:param doc_id: str
:return: str
"""
check_document(doc_id)
check_document_by_id(doc_id)
where = {
"document_id": doc_id,
# $eq doesn't work here for some reason -- maybe because objectid?
@@ -77,7 +81,7 @@ def get_annotations_for_document(doc_id):
:param doc_id: str
:return: str
"""
check_document(doc_id)
check_document_by_id(doc_id)
where = {
"document_id": doc_id
}
@@ -103,22 +107,15 @@ def get_current_annotation(doc_id, user_id):
else:
return None
def is_ner_annotation(ann):
"""
Verify that the provided annotation is in the valid format for an NER Annotation
:param ann: Any
:return: Bool
"""
return (type(ann) is list or type(ann) is tuple) and len(ann) == 3
def check_overlapping_annotations(document, ner_annotations):
ner_annotations.sort(key = lambda x: x[0])
resp = service.get("collections/" + document["collection_id"])
if not resp.ok:
abort(resp.status_code)
collection = resp.json()
# def is_ner_annotation(ann):
# """
# Verify that the provided annotation is in the valid format for an NER Annotation
# :param ann: Any
# :return: Bool
# """
# return (type(ann) is list or type(ann) is tuple) and len(ann) == 3
def check_overlapping_annotations(collection, ner_annotations):
# if allow_overlapping_ner_annotations is false, check them
if "configuration" in collection and CONFIG_ALLOW_OVERLAPPING_NER_ANNOTATIONS in collection["configuration"] and not collection["configuration"][CONFIG_ALLOW_OVERLAPPING_NER_ANNOTATIONS]:
for idx, val in enumerate(ner_annotations):
@@ -127,113 +124,114 @@ def check_overlapping_annotations(document, ner_annotations):
if val[0] < prev[1]:
raise exceptions.BadRequest("Collection is configured not to allow overlapping annotations")
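As a standalone illustration of the rule enforced above (hypothetical helper name; spans are
`(start, end, label)` tuples, and a span starting before the previous one ends counts as an overlap):
```python
def has_overlap(ner_annotations):
    # sort by start offset, then compare each span against its predecessor
    anns = sorted(ner_annotations, key=lambda x: x[0])
    return any(cur[0] < prev[1] for prev, cur in zip(anns, anns[1:]))

assert has_overlap([(0, 5, "PER"), (3, 8, "ORG")])      # 3 < 5: overlapping
assert not has_overlap([(0, 5, "PER"), (5, 8, "ORG")])  # touching spans are fine
```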
@bp.route("/mine/by_document_id/<doc_id>/ner", methods = ["POST", "PUT"])
@auth.login_required
def save_ner_annotations(doc_id):
"""
Save new NER annotations to the database as an entry for the logged in user, for the document. If there are already
annotations, use a patch request to update with the new annotations. If there are not, use a post request to create
a new entry.
:param doc_id: str
:return: str
"""
if not request.is_json:
raise exceptions.BadRequest()
check_document(doc_id)
document = service.get_item_by_id("documents", doc_id, {
"projection": json.dumps({
"collection_id": 1,
"metadata": 1
})
})
annotations = request.get_json()
user_id = auth.get_logged_in_user()["id"]
annotations = [(ann["start"], ann["end"], ann["label"]) for ann in annotations]
check_overlapping_annotations(document, annotations)
new_annotation = {
"creator_id": user_id,
"collection_id": document["collection_id"],
"document_id": doc_id,
"annotation": annotations
}
current_annotation = get_current_annotation(doc_id, user_id)
if current_annotation != None:
if current_annotation["annotation"] == annotations:
return jsonify(True)
headers = {"If-Match": current_annotation["_etag"]}
# add all the other non-ner labels
for annotation in current_annotation["annotation"]:
if not is_ner_annotation(annotation):
new_annotation["annotation"].append(annotation)
# @bp.route("/mine/by_document_id/<doc_id>/ner", methods = ["POST", "PUT"])
# @auth.login_required
# def save_ner_annotations(doc_id):
# """
# Save new NER annotations to the database as an entry for the logged in user, for the document. If there are already
# annotations, use a patch request to update with the new annotations. If there are not, use a post request to create
# a new entry.
# :param doc_id: str
# :return: str
# """
# if not request.is_json:
# raise exceptions.BadRequest()
# check_document_by_id(doc_id)
# document = service.get_item_by_id("documents", doc_id, {
# "projection": json.dumps({
# "collection_id": 1,
# "metadata": 1
# })
# })
# annotations = request.get_json()
# user_id = auth.get_logged_in_user()["id"]
# annotations = [(ann["start"], ann["end"], ann["label"]) for ann in annotations]
# check_overlapping_annotations(document, annotations)
# new_annotation = {
# "creator_id": user_id,
# "collection_id": document["collection_id"],
# "document_id": doc_id,
# "annotation": annotations
# }
#
# current_annotation = get_current_annotation(doc_id, user_id)
# if current_annotation != None:
# if current_annotation["annotation"] == annotations:
# return jsonify(True)
# headers = {"If-Match": current_annotation["_etag"]}
#
# # add all the other non-ner labels
# for annotation in current_annotation["annotation"]:
# if not is_ner_annotation(annotation):
# new_annotation["annotation"].append(annotation)
#
# resp = service.patch(["annotations", current_annotation["_id"]], json = new_annotation, headers = headers)
# else:
# resp = service.post("annotations", json = new_annotation)
#
# if resp.ok:
# new_annotation["_id"] = resp.json()["_id"]
# log.access_flask_annotate_document(document, new_annotation)
#
# return jsonify(resp.ok)
resp = service.patch(["annotations", current_annotation["_id"]], json = new_annotation, headers = headers)
else:
resp = service.post("annotations", json = new_annotation)
if resp.ok:
new_annotation["_id"] = resp.json()["_id"]
log.access_flask_annotate_document(document, new_annotation)
return jsonify(resp.ok)
# def is_doc_annotation(ann):
# """
# Verify that an annotation has the correct format (string)
# :param ann: Any
# :return: Bool
# """
# return isinstance(ann, str)
def is_doc_annotation(ann):
"""
Verify that an annotation has the correct format (string)
:param ann: Any
:return: Bool
"""
return isinstance(ann, str)
@bp.route("/mine/by_document_id/<doc_id>/doc", methods = ["POST", "PUT"])
@auth.login_required
def save_doc_labels(doc_id):
"""
Save new labels to the database as an entry for the logged in user, for the document. If there are already
annotations/labels, use a patch request to update with the new labels. If there are not, use a post request to
create a new entry.
:param doc_id:
:return:
"""
if not request.is_json:
raise exceptions.BadRequest()
check_document(doc_id)
document = service.get_item_by_id("documents", doc_id, {
"projection": json.dumps({
"collection_id": 1,
"metadata": 1
})
})
labels = request.get_json()
user_id = auth.get_logged_in_user()["id"]
new_annotation = {
"creator_id": user_id,
"collection_id": document["collection_id"],
"document_id": doc_id,
"annotation": labels
}
current_annotation = get_current_annotation(doc_id, user_id)
if current_annotation != None:
if current_annotation["annotation"] == labels:
return jsonify(True)
headers = {"If-Match": current_annotation["_etag"]}
# add all the other non-doc labels
for annotation in current_annotation["annotation"]:
if not is_doc_annotation(annotation):
new_annotation["annotation"].append(annotation)
resp = service.patch(["annotations", current_annotation["_id"]], json = new_annotation, headers = headers)
else:
resp = service.post("annotations", json = new_annotation)
if resp.ok:
new_annotation = resp.json()["_id"]
log.access_flask_annotate_document(document, new_annotation)
return jsonify(resp.ok)
# @bp.route("/mine/by_document_id/<doc_id>/doc", methods = ["POST", "PUT"])
# @auth.login_required
# def save_doc_labels(doc_id):
# """
# Save new labels to the database as an entry for the logged in user, for the document. If there are already
# annotations/labels, use a patch request to update with the new labels. If there are not, use a post request to
# create a new entry.
# :param doc_id:
# :return:
# """
# if not request.is_json:
# raise exceptions.BadRequest()
# check_document_by_id(doc_id)
# document = service.get_item_by_id("documents", doc_id, {
# "projection": json.dumps({
# "collection_id": 1,
# "metadata": 1
# })
# })
#
# labels = request.get_json()
# user_id = auth.get_logged_in_user()["id"]
# new_annotation = {
# "creator_id": user_id,
# "collection_id": document["collection_id"],
# "document_id": doc_id,
# "annotation": labels
# }
#
# current_annotation = get_current_annotation(doc_id, user_id)
# if current_annotation != None:
# if current_annotation["annotation"] == labels:
# return jsonify(True)
# headers = {"If-Match": current_annotation["_etag"]}
#
# # add all the other non-doc labels
# for annotation in current_annotation["annotation"]:
# if not is_doc_annotation(annotation):
# new_annotation["annotation"].append(annotation)
#
# resp = service.patch(["annotations", current_annotation["_id"]], json = new_annotation, headers = headers)
# else:
# resp = service.post("annotations", json = new_annotation)
#
# if resp.ok:
# new_annotation = resp.json()["_id"]
# log.access_flask_annotate_document(document, new_annotation)
#
# return jsonify(resp.ok)
def set_document_to_annotated_by_user(doc_id, user_id):
@@ -242,16 +240,76 @@ def set_document_to_annotated_by_user(doc_id, user_id):
document
:param doc_id: str
:param user_id: str
:return: Response | None
:return: whether the update succeeded
:rtype: bool
"""
document = service.get_item_by_id("/documents", doc_id)
document = service.get_item_by_id("/documents", doc_id, params={
"projection": json.dumps({
"has_annotated": 1
})
})
if "has_annotated" in document and user_id in document["has_annotated"] and document["has_annotated"][user_id]:
return True
new_document = {
"has_annotated": document["has_annotated"]
"has_annotated": document["has_annotated"] if "has_annotated" in document else {}
}
new_document["has_annotated"][user_id] = True
headers = {"If-Match": document["_etag"]}
return service.patch(["documents", doc_id], json=new_document, headers=headers).ok
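The `If-Match` header above is Eve's optimistic-concurrency mechanism. A hedged sketch of the same
pattern against a raw Eve endpoint, using plain `requests` (the base URL and resource names are
assumptions, not the internal `service` wrapper):
```python
import requests

EVE_URL = "http://localhost:5001"  # assumption: the eve layer from the dev stack

def patch_with_etag(resource: str, item_id: str, changes: dict) -> bool:
    item = requests.get(f"{EVE_URL}/{resource}/{item_id}").json()
    # Eve rejects the PATCH with 412 Precondition Failed if _etag is stale,
    # i.e. if another writer updated the item first.
    headers = {"If-Match": item["_etag"]}
    return requests.patch(f"{EVE_URL}/{resource}/{item_id}",
                          json=changes, headers=headers).ok
```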
def _make_annotations(body):
if not isinstance(body, dict) or "doc" not in body or "ner" not in body:
raise exceptions.BadRequest()
if (not isinstance(body["doc"], list) and not isinstance(body["doc"], tuple)) or \
(not isinstance(body["ner"], list) and not isinstance(body["ner"], tuple)):
raise exceptions.BadRequest()
doc_labels = body["doc"]
for ann in doc_labels:
if not isinstance(ann, str) or len(ann.strip()) == 0:
raise exceptions.BadRequest()
ner_annotations = body["ner"]
for (i, ann) in enumerate(ner_annotations):
if isinstance(ann, dict):
if "start" not in ann or "end" not in ann or "label" not in ann:
raise exceptions.BadRequest()
if not isinstance(ann["start"], int) or not isinstance(ann["end"], int) or \
not isinstance(ann["label"], str) or len(ann["label"].strip()) == 0:
raise exceptions.BadRequest()
ner_annotations[i] = (ann["start"], ann["end"], ann["label"])
elif isinstance(ann, list) or isinstance(ann, tuple):
if len(ann) != 3 or not isinstance(ann[0], int) or not isinstance(ann[1], int) or \
not isinstance(ann[2], str) or len(ann[2].strip()) == 0:
raise exceptions.BadRequest()
else:
raise exceptions.BadRequest()
ner_annotations.sort(key = lambda x: x[0])
return (doc_labels, ner_annotations)
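For reference, a sketch of the request body `_make_annotations` accepts; both NER shapes shown are
valid, and the values here are made up:
```python
body = {
    "doc": ["positive"],                              # document-level labels: non-empty strings
    "ner": [
        {"start": 10, "end": 15, "label": "PERSON"},  # dict form, converted to a tuple
        [0, 4, "ORG"],                                # list/tuple form, kept as-is
    ],
}
# After _make_annotations(body):
#   doc_labels      == ["positive"]
#   ner_annotations == [[0, 4, "ORG"], (10, 15, "PERSON")]  (sorted by start)
```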
def _add_or_update_annotation(new_annotation):
doc_id = new_annotation["document_id"]
user_id = new_annotation["creator_id"]
current_annotation = get_current_annotation(doc_id, user_id)
success = False
if current_annotation != None:
new_annotation["_id"] = current_annotation["_id"]
if current_annotation["annotation"] == new_annotation["annotation"]:
return new_annotation["_id"]
else:
headers = {"If-Match": current_annotation["_etag"]}
resp = service.patch(["annotations", current_annotation["_id"]], json = new_annotation, headers = headers)
else:
updated_annotated_field = set_document_to_annotated_by_user(doc_id, user_id)
resp = service.post("annotations", json = new_annotation)
success = resp.ok and updated_annotated_field
new_annotation["_id"] = resp.json()["_id"]
if success:
log.access_flask_annotate_document(new_annotation)
return new_annotation["_id"]
@bp.route("/mine/by_document_id/<doc_id>", methods = ["POST", "PUT"])
def save_annotations(doc_id):
@@ -260,47 +318,96 @@ def save_annotations(doc_id):
are already annotations, use a patch request to update with the new annotations. If there are not, use a post
request to create a new entry.
:param doc_id: str
:return: str
:return: bool
"""
# If you change input or output, update client modules pine.client.models and pine.client.client
if not request.is_json:
raise exceptions.BadRequest()
check_document(doc_id)
document = service.get_item_by_id("documents", doc_id, {
"projection": json.dumps({
document = service.get_item_by_id("documents", doc_id, params=service.params({
"projection": {
"collection_id": 1,
"metadata": 1
})
})
body = request.get_json()
if "doc" not in body or "ner" not in body:
raise exceptions.BadRequest()
labels = body["doc"]
annotations = [(ann["start"], ann["end"], ann["label"]) for ann in body["ner"]]
check_overlapping_annotations(document, annotations)
user_id = auth.get_logged_in_user()["id"]
}
}))
check_document(document)
body = request.get_json()
(doc_labels, ner_annotations) = _make_annotations(body)
collection = service.get_item_by_id("collections", document["collection_id"], params=service.params({
"projection": {
"configuration": 1
}
}))
check_overlapping_annotations(collection, ner_annotations)
user_id = auth.get_logged_in_user()["id"]
new_annotation = {
"creator_id": user_id,
"collection_id": document["collection_id"],
"document_id": doc_id,
"annotation": labels + annotations
"annotation": doc_labels + ner_annotations
}
return jsonify(_add_or_update_annotation(new_annotation))
@bp.route("/mine/by_collection_id/<collection_id>", methods = ["POST", "PUT"])
def save_collection_annotations(collection_id: str):
# If you change input or output, update client modules pine.client.models and pine.client.client
collection = service.get_item_by_id("collections", collection_id, params=service.params({
"projection": {
"configuration": 1,
"creator_id": 1,
"viewer": 1,
"annotators": 1
}
}))
if not collections.user_can_annotate(collection):
raise exceptions.Unauthorized()
if not request.is_json:
raise exceptions.BadRequest()
doc_annotations = request.get_json()
if not isinstance(doc_annotations, dict):
raise exceptions.BadRequest()
current_annotation = get_current_annotation(doc_id, user_id)
if current_annotation != None:
if current_annotation["annotation"] == new_annotation["annotation"]:
return jsonify(True)
headers = {"If-Match": current_annotation["_etag"]}
resp = service.patch(["annotations", current_annotation["_id"]], json = new_annotation, headers = headers)
else:
updated_annotated_field = set_document_to_annotated_by_user(doc_id, user_id)
resp = service.post("annotations", json = new_annotation)
skip_document_updates = json.loads(request.args.get("skip_document_updates", "false"))
# make sure all the documents actually belong to that collection
collection_ids = list(documents.get_collection_ids_for(doc_annotations.keys()))
if len(collection_ids) != 1 or collection_ids[0] != collection_id:
raise exceptions.Unauthorized()
user_id = auth.get_logged_in_user()["id"]
# first try batch mode
new_annotations = []
for (doc_id, body) in doc_annotations.items():
(doc_labels, ner_annotations) = _make_annotations(body)
check_overlapping_annotations(collection, ner_annotations)
new_annotations.append({
"creator_id": user_id,
"collection_id": collection_id,
"document_id": doc_id,
"annotation": doc_labels + ner_annotations
})
resp = service.post("annotations", json=new_annotations)
if resp.ok:
new_annotation["_id"] = resp.json()["_id"]
log.access_flask_annotate_document(document, new_annotation)
return jsonify(resp.ok)
for (i, created_annotation) in enumerate(resp.json()["_items"]):
new_annotations[i]["_id"] = created_annotation["_id"]
if not skip_document_updates:
set_document_to_annotated_by_user(new_annotations[i]["document_id"],
new_annotations[i]["creator_id"])
log.access_flask_annotate_documents(new_annotations)
return jsonify([annotation["_id"] for annotation in new_annotations])
# fall back on individual mode
added_ids = []
for annotation in new_annotations:
added_id = _add_or_update_annotation(annotation)
if added_id:
added_ids.append(added_id)
return jsonify(added_ids)
def init_app(app):
app.register_blueprint(bp)


@@ -6,14 +6,18 @@ import os
from . import log
log.setup_logging()
from flask import Flask, jsonify
from flask import Flask, abort, jsonify
from flask import __version__ as flask_version
from werkzeug import exceptions
from . import config
VERSION = os.environ.get("PINE_VERSION", "unknown-no-env")
LOGGER = logging.getLogger(__name__)
def handle_error(e):
logging.getLogger(__name__).error(e, exc_info=True)
return jsonify(e.description), e.code
return jsonify(str(e.description)), e.code
def handle_uncaught_exception(e):
if isinstance(e, exceptions.InternalServerError):
@@ -54,6 +58,23 @@ def create_app(test_config = None):
def ping():
return jsonify("pong")
from .data import service as service
@app.route("/about")
def about():
resp = service.get("about")
if not resp.ok:
abort(resp.status_code)
about = {
"version": VERSION,
"flask_version": flask_version,
"db": resp.json()
}
LOGGER.info("Eve service performance history:")
LOGGER.info(service.PERFORMANCE_HISTORY.pformat())
LOGGER.info("Version information:")
LOGGER.info(about)
return jsonify(about)
from . import cors
cors.init_app(app)


@@ -69,6 +69,8 @@ def flask_get_login_form() -> Response:
@bp.route("/logout", methods = ["POST"])
def flask_post_logout() -> Response:
user = module.get_logged_in_user()
if user == None:
raise exceptions.BadRequest()
module.logout()
log.access_flask_logout(user)
return Response(status = 200)


@@ -5,11 +5,11 @@ import bcrypt
import hashlib
def hash_password(password: str) -> str:
sha256 = hashlib.sha256(password.encode()).digest()
sha256 = hashlib.sha256(password.encode()).digest().replace(b"\x00", b"")
hashed_password_bytes = bcrypt.hashpw(sha256, bcrypt.gensalt())
return base64.b64encode(hashed_password_bytes).decode()
def check_password(password: str, hashed_password: str):
sha256 = hashlib.sha256(password.encode()).digest()
sha256 = hashlib.sha256(password.encode()).digest().replace(b"\x00", b"")
hashed_password_bytes = base64.b64decode(hashed_password.encode())
return bcrypt.checkpw(sha256, hashed_password_bytes)
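The `.replace(b"\x00", b"")` above matters because bcrypt treats its input as a C-style string: a
raw SHA-256 digest can contain NUL bytes, which bcrypt implementations either truncate at or reject
outright. A self-contained round trip of the scheme (pre-hashing with SHA-256 also keeps inputs
under bcrypt's 72-byte limit):
```python
import base64
import hashlib

import bcrypt  # assumption: the pyca/bcrypt package

def hash_password(password: str) -> str:
    # scrub NUL bytes from the digest before handing it to bcrypt
    sha256 = hashlib.sha256(password.encode()).digest().replace(b"\x00", b"")
    return base64.b64encode(bcrypt.hashpw(sha256, bcrypt.gensalt())).decode()

def check_password(password: str, hashed_password: str) -> bool:
    sha256 = hashlib.sha256(password.encode()).digest().replace(b"\x00", b"")
    return bcrypt.checkpw(sha256, base64.b64decode(hashed_password.encode()))

assert check_password("hunter2", hash_password("hunter2"))
```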


@@ -3,4 +3,4 @@
"""This module contains the api methods required to interact with, organize, create, and display collections in the
front-end and store the collections in the backend"""
from .bp import user_can_annotate, user_can_view, user_can_add_documents_or_images, user_can_modify_document_metadata, user_can_annotate_by_id, user_can_view_by_id, user_can_add_documents_or_images_by_id, user_can_modify_document_metadata_by_id
from .bp import user_can_annotate, user_can_view, user_can_add_documents_or_images, user_can_modify_document_metadata, user_can_annotate_by_id, user_can_annotate_by_ids, user_can_view_by_id, user_can_add_documents_or_images_by_id, user_can_modify_document_metadata_by_id


@@ -16,14 +16,27 @@ from .. import auth, log
from ..data import service
bp = Blueprint("collections", __name__, url_prefix = "/collections")
logger = logging.getLogger(__name__)
LOGGER = logging.getLogger(__name__)
DOCUMENTS_PER_TRANSACTION = 500
# Cache this info for uploading large numbers of images sequentially
LAST_COLLECTION_FOR_IMAGE = None
def is_cached_last_collection(collection_id):
global LAST_COLLECTION_FOR_IMAGE
return LAST_COLLECTION_FOR_IMAGE and LAST_COLLECTION_FOR_IMAGE[0] == collection_id and LAST_COLLECTION_FOR_IMAGE[1] == auth.get_logged_in_user()["id"]
def update_cached_last_collection(collection_id):
global LAST_COLLECTION_FOR_IMAGE
LAST_COLLECTION_FOR_IMAGE = [collection_id, auth.get_logged_in_user()["id"]]
def _collection_user_can_projection():
return {"projection": json.dumps({
"creator_id": 1,
"annotators": 1,
"viewers": 1
})}
return service.params({
"projection": {
"creator_id": 1,
"annotators": 1,
"viewers": 1
}
})
def _collection_user_can(collection, annotate):
user_id = auth.get_logged_in_user()["id"]
@@ -50,6 +63,21 @@ def user_can_annotate_by_id(collection_id):
collection = service.get_item_by_id("/collections", collection_id, _collection_user_can_projection())
return _collection_user_can(collection, annotate = True)
def user_can_annotate_by_ids(collection_ids):
collections = service.get_items("collections", params=service.params({
"where": {
"_id": {"$in": collection_ids}
}, "projection": {
"creator_id": 1,
"annotators": 1,
"viewers": 1
}
}))
for collection in collections:
if not user_can_annotate(collection):
return False
return True
def user_can_view_by_id(collection_id):
collection = service.get_item_by_id("/collections", collection_id, _collection_user_can_projection())
return _collection_user_can(collection, annotate = False)
@@ -163,7 +191,7 @@ def get_collection(collection_id):
:param collection_id: str
:return: Response
"""
resp = service.get("collections/" + collection_id)
resp = service.get(["collections", collection_id])
if not resp.ok:
abort(resp.status_code)
collection = resp.json()
@@ -175,7 +203,7 @@ def get_collection(collection_id):
@bp.route("/by_id/<collection_id>/download", methods = ["GET"])
@auth.login_required
def download_collection(collection_id):
resp = service.get("/collections/" + collection_id)
resp = service.get(["collections", collection_id])
if not resp.ok:
abort(resp.status_code)
collection = resp.json()
@@ -291,7 +319,7 @@ def add_annotator_to_collection(collection_id):
if not (collection["creator_id"] == auth_id):
raise exceptions.Unauthorized()
if user_id not in collection["annotators"]:
logger.info("new annotator: adding to collection")
LOGGER.info("new annotator: adding to collection")
collection["annotators"].append(user_id)
if user_id not in collection["viewers"]:
collection["viewers"].append(user_id)
@@ -324,7 +352,7 @@ def add_viewer_to_collection(collection_id):
if not (collection["creator_id"] == auth_id):
raise exceptions.Unauthorized()
if user_id not in collection["viewers"]:
logger.info("new viewer: adding to collection")
LOGGER.info("new viewer: adding to collection")
collection["viewers"].append(user_id)
to_patch = {
"viewers": collection["viewers"]
@@ -350,7 +378,7 @@ def add_label_to_collection(collection_id):
if not (collection["creator_id"] == auth_id):
raise exceptions.Unauthorized()
if new_label not in collection["labels"]:
logger.info("new viewer: adding to collection")
LOGGER.info("new label: adding to collection")
collection["labels"].append(new_label)
to_patch = {
"labels": collection["labels"]
@@ -375,6 +403,22 @@ def get_overlap_ids(collection_id):
return [doc["_id"] for doc in service.get_all_using_pagination("documents", params)['_items']]
def _upload_documents(collection, docs):
doc_resp = service.post("/documents", json=docs)
# TODO if it failed, roll back the created collection and classifier
if not doc_resp.ok:
abort(doc_resp.status_code, doc_resp.content)
r = doc_resp.json()
# TODO if it failed, roll back the created collection and classifier
if r["_status"] != "OK":
abort(400, "Unable to create documents")
for obj in r["_items"]:
if obj["_status"] != "OK":
abort(400, "Unable to create documents")
doc_ids = [obj["_id"] for obj in r["_items"]]
LOGGER.info("Added {} docs to collection {}".format(len(doc_ids), collection["_id"]))
return doc_ids
# Require a multipart form post:
# CSV is in the form file "file"
# Optional images are in the form file fields "imageFileN" where N is an (ignored) index
@@ -387,6 +431,7 @@ def get_overlap_ids(collection_id):
@bp.route("/", strict_slashes = False, methods = ["POST"])
@auth.login_required
def create_collection():
# If you change the requirements here, also update the client module pine.client.models
"""
Create a new collection based upon the entries provided in the POST request's associated form fields.
These fields include:
@@ -394,7 +439,7 @@ def create_collection():
overlap - ratio of overlapping documents. (0-1) with 0 being no overlap and 1 being every document has overlap, ex:
.90 - 90% of documents overlap
train_every - automatically train a new classifier after this many documents have been annotated
pipeline_id - the id value of the classifier pipeline associated with this collection (spacy, opennlp, corenlp)
pipelineId - the id value of the classifier pipeline associated with this collection (spacy, opennlp, corenlp)
classifierParameters - optional parameters that adjust the configuration of the chosen classifier pipeline.
archived - whether or not this collection should be archived.
A collection can be created with documents listed in a csv file. Each new line in the csv represents a new document.
@@ -451,7 +496,7 @@ def create_collection():
collection_id = r["_id"]
collection["_id"] = collection_id
log.access_flask_add_collection(collection)
logger.info("Created collection", collection_id)
LOGGER.info("Created collection {}".format(collection_id))
#create classifier
# require collection_id, overlap, pipeline_id and labels
@@ -473,7 +518,7 @@ def create_collection():
if r["_status"] != "OK":
abort(400, "Unable to create classifier")
classifier_id = r["_id"]
logger.info("Created classifier", classifier_id)
LOGGER.info("Created classifier {}".format(classifier_id))
# create metrics for classifier
# require collection_id, classifier_id, document_ids and annotations ids
@@ -492,7 +537,7 @@ def create_collection():
if r["_status"] != "OK":
abort(400, "Unable to create metrics")
metrics_id = r["_id"]
logger.info("Created metrics", metrics_id)
LOGGER.info("Created metrics {}".format(metrics_id))
#create documents if CSV file was sent in
doc_ids = []
@@ -529,20 +574,12 @@ def create_collection():
else:
doc["overlap"] = 0
docs.append(doc)
doc_resp = service.post("/documents", json=docs)
# TODO if it failed, roll back the created collection and classifier
if not doc_resp.ok:
abort(doc_resp.status_code, doc_resp.content)
r = doc_resp.json()
# TODO if it failed, roll back the created collection and classifier
if r["_status"] != "OK":
abort(400, "Unable to create documents")
logger.info(r["_items"])
for obj in r["_items"]:
if obj["_status"] != "OK":
abort(400, "Unable to create documents")
doc_ids = [obj["_id"] for obj in r["_items"]]
logger.info("Added docs:", doc_ids)
if len(docs) >= DOCUMENTS_PER_TRANSACTION:
doc_ids += _upload_documents(collection, docs)
docs = []
if len(docs) > 0:
doc_ids += _upload_documents(collection, docs)
docs = []
# create next ids
(doc_ids, overlap_ids) = get_doc_and_overlap_ids(collection_id)
@@ -568,8 +605,9 @@ def create_collection():
def _check_collection_and_get_image_dir(collection_id, path):
# make sure user can view collection
if not user_can_view_by_id(collection_id):
raise exceptions.Unauthorized()
if not is_cached_last_collection(collection_id):
if not user_can_view_by_id(collection_id):
raise exceptions.Unauthorized()
image_dir = current_app.config["DOCUMENT_IMAGE_DIR"]
if image_dir == None or len(image_dir) == 0:
@@ -582,6 +620,24 @@ def _check_collection_and_get_image_dir(collection_id, path):
return os.path.realpath(image_dir)
@bp.route("/static_images/<collection_id>", methods=["GET"])
@auth.login_required
def get_static_collection_images(collection_id):
static_image_dir = os.path.join(_check_collection_and_get_image_dir(collection_id, "static/"), "static")
urls = []
for _, _, filenames in os.walk(static_image_dir):
urls += ["/static/{}".format(f) for f in filenames]
return jsonify(urls)
@bp.route("/images/<collection_id>", methods=["GET"])
@auth.login_required
def get_collection_images(collection_id):
collection_image_dir = _check_collection_and_get_image_dir(collection_id, "")
urls = []
for _, _, filenames in os.walk(collection_image_dir):
urls += ["/{}".format(f) for f in filenames]
return jsonify(urls)
@bp.route("/image/<collection_id>/<path:path>", methods=["GET"])
@auth.login_required
def get_collection_image(collection_id, path):
@@ -639,11 +695,14 @@ def _upload_collection_image_file(collection_id, path, image_file):
@bp.route("/image/<collection_id>/<path:path>", methods=["POST", "PUT"])
@auth.login_required
def post_collection_image(collection_id, path):
if not user_can_add_documents_or_images_by_id(collection_id):
raise exceptions.Unauthorized()
if not is_cached_last_collection(collection_id):
if not user_can_add_documents_or_images_by_id(collection_id):
raise exceptions.Unauthorized()
if "file" not in request.files:
raise exceptions.BadRequest("Missing file form part.")
update_cached_last_collection(collection_id)
return jsonify(_upload_collection_image_file(collection_id, path, request.files["file"]))
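A hedged sketch of calling this endpoint (the backend base URL and authentication are assumptions;
the image must be sent in the `file` form part, per the check above):
```python
import requests

BACKEND_URL = "http://localhost:5000"  # assumption: dev-stack backend
collection_id = "<collection-id>"      # hypothetical

session = requests.Session()           # assumes an already-authenticated session
with open("page1.png", "rb") as f:
    resp = session.post(
        f"{BACKEND_URL}/collections/image/{collection_id}/page1.png",
        files={"file": f},  # the "file" form part is required
    )
```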
def init_app(app):


@@ -24,7 +24,8 @@ else:
REDIS_PORT = int(os.environ.get("REDIS_PORT", 6479))
AUTH_MODULE = os.environ.get("AUTH_MODULE", "vegas")
if not AUTH_MODULE: AUTH_MODULE = "vegas"
VEGAS_CLIENT_SECRET = os.environ.get("VEGAS_CLIENT_SECRET", None)
DOCUMENT_IMAGE_DIR = os.environ.get("DOCUMENT_IMAGE_DIR")
DOCUMENT_IMAGE_DIR = os.environ.get("DOCUMENT_IMAGE_DIR", "/mnt/azure")


@@ -3,7 +3,7 @@
import json
import logging
import math
from pprint import pprint
from pprint import pformat, pprint
import threading
from flask import abort, current_app, Response
@@ -23,6 +23,9 @@ class PerformanceHistory(object):
}
self.lock = threading.Lock()
def pformat(self, **kwargs):
return pformat(self.data, **kwargs)
def pprint(self):
self.lock.acquire()
try:
@@ -53,6 +56,7 @@ class PerformanceHistory(object):
PERFORMANCE_HISTORY = PerformanceHistory()
def _standardize_path(path, *additional_paths):
# if you change this, also update client code in pine.client.client module
if type(path) not in [list, tuple, set]:
path = [path]
if additional_paths:

View File

@@ -12,7 +12,16 @@ def get_all_users():
return service.get_items("/users")
def get_user(user_id):
return service.get_item_by_id("/users", user_id)
# getting by ID in the normal way doesn't work sometimes
items = service.get_items("users", params=service.params({
"where": {
"_id": user_id
}
}))
if items and len(items) == 1:
return items[0]
else:
abort(404)
def get_user_by_email(email):
where = {
@@ -107,7 +116,7 @@ def reset_user_passwords():
service.remove_nonupdatable_fields(user)
headers = {"If-Match": etag}
click.echo("Putting to {}: {}".format(service.url("users", user["_id"]), user))
resp = service.put("users", user["_id"], json = user, headers = headers)
resp = service.put(["users", user["_id"]], json = user, headers = headers)
if not resp.ok:
click.echo("Failure! {}".format(resp))
else:

View File

@@ -1,6 +1,7 @@
# (C) 2019 The Johns Hopkins University Applied Physics Laboratory LLC.
import json
import random
from flask import abort, Blueprint, jsonify, request
from werkzeug import exceptions
@@ -15,6 +16,18 @@ def _document_user_can_projection():
"collection_id": 1
}})
def get_collection_ids_for(document_ids) -> set:
if isinstance(document_ids, str):
document_ids = [document_ids]
# ideally we would use some "unique" or "distinct" feature here but eve doesn't seem to have it
return set(item["collection_id"] for item in service.get_items("documents", params=service.params({
"where": {
"_id": {"$in": list(document_ids)}
}, "projection": {
"collection_id": 1
}
})))
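Hedged sketch of the raw Eve query this corresponds to (`where` and `projection` are standard Eve
query parameters; the base URL and ids are made up, and `service.params` is assumed to JSON-encode
its values):
```python
import json

import requests

resp = requests.get("http://localhost:5001/documents", params={
    "where": json.dumps({"_id": {"$in": ["<doc-id-1>", "<doc-id-2>"]}}),
    "projection": json.dumps({"collection_id": 1}),
})
# de-duplicate client-side, since eve has no "distinct" feature
collection_ids = {item["collection_id"] for item in resp.json()["_items"]}
```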
def user_can_annotate(document):
return collections.user_can_annotate_by_id(document["collection_id"])
@@ -28,6 +41,9 @@ def user_can_annotate_by_id(document_id):
document = service.get_item_by_id("documents", document_id, params=_document_user_can_projection())
return user_can_annotate(document)
def user_can_annotate_by_ids(document_ids):
return collections.user_can_annotate_by_ids(get_collection_ids_for(document_ids))
def user_can_view_by_id(document_id):
document = service.get_item_by_id("documents", document_id, params=_document_user_can_projection())
return user_can_view(document)
@@ -63,34 +79,147 @@ def get_documents_in_collection(col_id, page):
"collection_id": col_id
})
if truncate:
params["projection"] = json.dumps({"metadata": 0})
params["truncate"] = truncate_length
if truncate_length == 0:
params["projection"] = json.dumps({
"metadata": 0,
"text": 0
})
else:
params["projection"] = json.dumps({
"metadata": 0
})
params["truncate"] = truncate_length
if page == "all":
return jsonify(service.get_all_using_pagination("documents", params))
if page: params["page"] = page
resp = service.get("documents", params = params)
if not resp.ok:
abort(resp.status_code, resp.content)
data = resp.json()
if truncate:
if truncate and truncate_length != 0:
for document in data["_items"]:
document["text"] = document["text"][0:truncate_length]
return jsonify(data)
def _check_documents(documents) -> dict:
collection_ids = set()
for document in documents:
if not isinstance(document, dict) or "collection_id" not in document or not document["collection_id"]:
raise exceptions.BadRequest()
collection_ids.add(document["collection_id"])
collections_by_id = {}
for collection_id in collection_ids:
collection = service.get_item_by_id("collections", collection_id, params=service.params({
"projection": {
"creator_id": 1,
"annotators": 1,
"viewers": 1
}
}))
if not collections.user_can_add_documents_or_images(collection):
raise exceptions.Unauthorized()
collections_by_id[collection_id] = collection
return collections_by_id
@bp.route("/", strict_slashes = False, methods = ["POST"])
@auth.login_required
def add_document():
document = request.get_json()
if not document or "collection_id" not in document or not document["collection_id"]:
docs = request.get_json()
if not docs or (not isinstance(docs, dict) and not isinstance(docs, list) and not isinstance(docs, tuple)):
raise exceptions.BadRequest()
if not collections.user_can_add_documents_or_images_by_id(document["collection_id"]):
raise exceptions.Unauthorized()
resp = service.post("documents", json=document)
if resp.ok:
log.access_flask_add_document(resp.json())
return service.convert_response(resp)
collections_by_id = _check_documents(docs if isinstance(docs, list) else [docs])
# Get overlap information stored in related classifier db object, and assign overlap for added document
collection_classifiers = {}
for doc in (docs if isinstance(docs, list) else [docs]):
# get classifier overlaps
if doc["collection_id"] not in collection_classifiers:
params = service.params({
"where": {
"collection_id": doc["collection_id"]
}, "projection": {
"overlap": 1
}
})
resp = service.get("classifiers", params=params)
if not resp.ok:
abort(resp.status_code)
classifier_obj = resp.json()["_items"]
if len(classifier_obj) != 1:
raise exceptions.BadRequest()
collection_classifiers[doc["collection_id"]] = classifier_obj[0]
classifier = collection_classifiers[doc["collection_id"]]
overlap = classifier["overlap"]
doc["overlap"] = 1 if random.random() < overlap else 0
# initialize has_annotated dict
if "has_annotated" not in doc:
doc["has_annotated"] = {user_id: False for user_id in collections_by_id[doc["collection_id"]]["annotators"]}
# Add document(s) to database
doc_resp = service.post("documents", json=docs)
if doc_resp.ok:
if isinstance(docs, dict):
log.access_flask_add_document(doc_resp.json())
else:
log.access_flask_add_documents(doc_resp.json()["_items"])
else:
abort(doc_resp.status_code)
if isinstance(docs, dict):
docs = [docs]
doc_ids = [doc_resp.json()["_id"]]
else:
doc_ids = [d["_id"] for d in doc_resp.json()["_items"]]
# Update next instances for added documents
classifier_next_instances = {}
for (i, document) in enumerate(docs):
doc_id = doc_ids[i]
classifier = collection_classifiers[document["collection_id"]]
classifier_id = classifier["_id"]
if classifier_id not in classifier_next_instances:
# Get next_instances to which we'll add document
next_instances_params = service.params({
"where": {
"classifier_id": classifier_id
}, "projection": {
"overlap_document_ids": 1,
"document_ids": 1
}
})
resp = service.get("next_instances", params=next_instances_params)
if not resp.ok:
abort(resp.status_code)
next_instances_obj = resp.json()["_items"]
if len(next_instances_obj) != 1:
raise exceptions.BadRequest()
classifier_next_instances[classifier_id] = next_instances_obj[0]
next_instances = classifier_next_instances[classifier_id]
if document["overlap"] == 1:
# Add document to overlap IDs for each annotator if it's an overlap document
for annotator in next_instances["overlap_document_ids"]:
next_instances["overlap_document_ids"][annotator].append(doc_id)
else:
# Add document to document_ids if it's not an overlap document
next_instances["document_ids"].append(doc_id)
# Patch next_instances with new documents
for next_instances in classifier_next_instances.values():
headers = {"If-Match": next_instances["_etag"]}
service.remove_nonupdatable_fields(next_instances)
resp = service.patch(["next_instances", next_instances["_id"]], json=next_instances, headers=headers)
if not resp.ok:
raise exceptions.BadRequest()
return service.convert_response(doc_resp)
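For reference, a sketch of the two body shapes this endpoint now accepts (field values are made up;
`overlap` and `has_annotated` are filled in server-side when absent, as shown above):
```python
single = {"collection_id": "<collection-id>", "text": "one document"}
batch = [
    {"collection_id": "<collection-id>", "text": "first document"},
    {"collection_id": "<collection-id>", "text": "second document"},
]
# POST either `single` or `batch` as the JSON body of /documents/.
```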
@bp.route("/can_annotate/<doc_id>", methods = ["GET"])
@auth.login_required


@@ -1,7 +1,10 @@
# (C) 2019 The Johns Hopkins University Applied Physics Laboratory LLC.
import enum
import json
import logging.config
import os
import typing
# make sure this package has been installed
import pythonjsonlogger
@@ -17,7 +20,9 @@ class Action(enum.Enum):
CREATE_COLLECTION = enum.auto()
VIEW_DOCUMENT = enum.auto()
ADD_DOCUMENT = enum.auto()
ADD_DOCUMENTS = enum.auto()
ANNOTATE_DOCUMENT = enum.auto()
ANNOTATE_DOCUMENTS = enum.auto()
def setup_logging():
if CONFIG_FILE_ENV not in os.environ:
@@ -51,10 +56,10 @@ def get_flask_logged_in_user():
def access_flask_login():
access(Action.LOGIN, get_flask_logged_in_user(), get_flask_request_info(), None)
def access_flask_logout(user):
def access_flask_logout(user: dict):
access(Action.LOGOUT, {"id": user["id"], "username": user["username"]}, get_flask_request_info(), None)
def access_flask_add_collection(collection):
def access_flask_add_collection(collection: dict):
extra_info = {
"collection_id": collection["_id"]
}
@@ -65,7 +70,7 @@ def access_flask_add_collection(collection):
del extra_info["collection_metadata"][k]
access(Action.CREATE_COLLECTION, get_flask_logged_in_user(), get_flask_request_info(), None, **extra_info)
def access_flask_view_document(document):
def access_flask_view_document(document: dict):
extra_info = {
"document_id": document["_id"]
}
@@ -73,7 +78,7 @@ def access_flask_view_document(document):
extra_info["document_metadata"] = document["metadata"]
access(Action.VIEW_DOCUMENT, get_flask_logged_in_user(), get_flask_request_info(), None, **extra_info)
def access_flask_add_document(document):
def access_flask_add_document(document: dict):
extra_info = {
"document_id": document["_id"]
}
@@ -81,15 +86,31 @@ def access_flask_add_document(document):
extra_info["document_metadata"] = document["metadata"]
access(Action.ADD_DOCUMENT, get_flask_logged_in_user(), get_flask_request_info(), None, **extra_info)
def access_flask_annotate_document(document, annotation):
def access_flask_add_documents(documents: typing.List[dict]):
doc_info = []
for document in documents:
i = {"id": document["_id"]}
if "metadata" in document: i["metadata"] = document["metadata"]
doc_info.append(i)
extra_info = {
"document_id": document["_id"],
"documents": doc_info
}
access(Action.ADD_DOCUMENTS, get_flask_logged_in_user(), get_flask_request_info(), None, **extra_info)
def access_flask_annotate_document(annotation):
extra_info = {
"document_id": annotation["document_id"],
"annotation_id": annotation["_id"]
}
if "metadata" in document:
extra_info["document_metadata"] = document["metadata"]
access(Action.ANNOTATE_DOCUMENT, get_flask_logged_in_user(), get_flask_request_info(), None, **extra_info)
def access_flask_annotate_documents(annotations: typing.List[dict]):
extra_info = {
"document_ids": [annotation["document_id"] for annotation in annotations],
"annotation_ids": [annotation["_id"] for annotation in annotations]
}
access(Action.ANNOTATE_DOCUMENTS, get_flask_logged_in_user(), get_flask_request_info(), None, **extra_info)
###############
def access(action, user, request_info, message, **extra_info):


@@ -67,7 +67,7 @@ def fix_num_for_json(number):
def getIAAReportForCollection(collection_id):
combined = get_doc_annotations(collection_id) ## exclude=set(['bchee1'])
combined = get_doc_annotations(collection_id)
labels = set()
for v in combined.values():


@@ -66,16 +66,6 @@ def _get_classifier_metrics(classifier_id):
logger.info(all_metrics)
return all_metrics
def _get_collection_classifier(collection_id):
where = {
"collection_id": collection_id
}
classifiers = service.get_items("/classifiers", params=service.where_params(where))
if len(classifiers) != 1:
raise exceptions.BadRequest(description="Expected one classifier but found {}.".format(len(classifiers)))
return classifiers[0]
@bp.route("/metrics", methods=["GET"])
@auth.login_required
def get_metrics():
@@ -140,6 +130,8 @@ def get_next_by_classifier(classifier_id):
return jsonify(instance["overlap_document_ids"][user_id].pop())
elif len(instance["document_ids"]) > 0:
return jsonify(instance["document_ids"].pop())
elif len(instance["overlap_document_ids"][user_id]) > 0:
return jsonify(instance["overlap_document_ids"][user_id].pop())
else:
return jsonify(None)

client/Pipfile Normal file

@@ -0,0 +1,18 @@
# (C) 2019 The Johns Hopkins University Applied Physics Laboratory LLC.
[[source]]
url = "https://pypi.org/simple"
verify_ssl = true
name = "pypi"
[packages]
pymongo = "*"
requests = "*"
overrides = "*"
python-json-logger = "*"
bcrypt = "*"
[dev-packages]
[requires]
python_version = "3.6"

client/Pipfile.lock generated Normal file

@@ -0,0 +1,201 @@
{
"_meta": {
"hash": {
"sha256": "a97ae1c4a3394a19df62fcd5603bd637df6dbdda51e75c91bc5594cb1a68ac48"
},
"pipfile-spec": 6,
"requires": {
"python_version": "3.6"
},
"sources": [
{
"name": "pypi",
"url": "https://pypi.org/simple",
"verify_ssl": true
}
]
},
"default": {
"bcrypt": {
"hashes": [
"sha256:0258f143f3de96b7c14f762c770f5fc56ccd72f8a1857a451c1cd9a655d9ac89",
"sha256:0b0069c752ec14172c5f78208f1863d7ad6755a6fae6fe76ec2c80d13be41e42",
"sha256:19a4b72a6ae5bb467fea018b825f0a7d917789bcfe893e53f15c92805d187294",
"sha256:5432dd7b34107ae8ed6c10a71b4397f1c853bd39a4d6ffa7e35f40584cffd161",
"sha256:6305557019906466fc42dbc53b46da004e72fd7a551c044a827e572c82191752",
"sha256:69361315039878c0680be456640f8705d76cb4a3a3fe1e057e0f261b74be4b31",
"sha256:6fe49a60b25b584e2f4ef175b29d3a83ba63b3a4df1b4c0605b826668d1b6be5",
"sha256:74a015102e877d0ccd02cdeaa18b32aa7273746914a6c5d0456dd442cb65b99c",
"sha256:763669a367869786bb4c8fcf731f4175775a5b43f070f50f46f0b59da45375d0",
"sha256:8b10acde4e1919d6015e1df86d4c217d3b5b01bb7744c36113ea43d529e1c3de",
"sha256:9fe92406c857409b70a38729dbdf6578caf9228de0aef5bc44f859ffe971a39e",
"sha256:a190f2a5dbbdbff4b74e3103cef44344bc30e61255beb27310e2aec407766052",
"sha256:a595c12c618119255c90deb4b046e1ca3bcfad64667c43d1166f2b04bc72db09",
"sha256:c9457fa5c121e94a58d6505cadca8bed1c64444b83b3204928a866ca2e599105",
"sha256:cb93f6b2ab0f6853550b74e051d297c27a638719753eb9ff66d1e4072be67133",
"sha256:ce4e4f0deb51d38b1611a27f330426154f2980e66582dc5f438aad38b5f24fc1",
"sha256:d7bdc26475679dd073ba0ed2766445bb5b20ca4793ca0db32b399dccc6bc84b7",
"sha256:ff032765bb8716d9387fd5376d987a937254b0619eff0972779515b5c98820bc"
],
"index": "pypi",
"version": "==3.1.7"
},
"certifi": {
"hashes": [
"sha256:5930595817496dd21bb8dc35dad090f1c2cd0adfaf21204bf6732ca5d8ee34d3",
"sha256:8fc0819f1f30ba15bdb34cceffb9ef04d99f420f68eb75d901e9560b8749fc41"
],
"version": "==2020.6.20"
},
"cffi": {
"hashes": [
"sha256:001bf3242a1bb04d985d63e138230802c6c8d4db3668fb545fb5005ddf5bb5ff",
"sha256:00789914be39dffba161cfc5be31b55775de5ba2235fe49aa28c148236c4e06b",
"sha256:028a579fc9aed3af38f4892bdcc7390508adabc30c6af4a6e4f611b0c680e6ac",
"sha256:14491a910663bf9f13ddf2bc8f60562d6bc5315c1f09c704937ef17293fb85b0",
"sha256:1cae98a7054b5c9391eb3249b86e0e99ab1e02bb0cc0575da191aedadbdf4384",
"sha256:2089ed025da3919d2e75a4d963d008330c96751127dd6f73c8dc0c65041b4c26",
"sha256:2d384f4a127a15ba701207f7639d94106693b6cd64173d6c8988e2c25f3ac2b6",
"sha256:337d448e5a725bba2d8293c48d9353fc68d0e9e4088d62a9571def317797522b",
"sha256:399aed636c7d3749bbed55bc907c3288cb43c65c4389964ad5ff849b6370603e",
"sha256:3b911c2dbd4f423b4c4fcca138cadde747abdb20d196c4a48708b8a2d32b16dd",
"sha256:3d311bcc4a41408cf5854f06ef2c5cab88f9fded37a3b95936c9879c1640d4c2",
"sha256:62ae9af2d069ea2698bf536dcfe1e4eed9090211dbaafeeedf5cb6c41b352f66",
"sha256:66e41db66b47d0d8672d8ed2708ba91b2f2524ece3dee48b5dfb36be8c2f21dc",
"sha256:675686925a9fb403edba0114db74e741d8181683dcf216be697d208857e04ca8",
"sha256:7e63cbcf2429a8dbfe48dcc2322d5f2220b77b2e17b7ba023d6166d84655da55",
"sha256:8a6c688fefb4e1cd56feb6c511984a6c4f7ec7d2a1ff31a10254f3c817054ae4",
"sha256:8c0ffc886aea5df6a1762d0019e9cb05f825d0eec1f520c51be9d198701daee5",
"sha256:95cd16d3dee553f882540c1ffe331d085c9e629499ceadfbda4d4fde635f4b7d",
"sha256:99f748a7e71ff382613b4e1acc0ac83bf7ad167fb3802e35e90d9763daba4d78",
"sha256:b8c78301cefcf5fd914aad35d3c04c2b21ce8629b5e4f4e45ae6812e461910fa",
"sha256:c420917b188a5582a56d8b93bdd8e0f6eca08c84ff623a4c16e809152cd35793",
"sha256:c43866529f2f06fe0edc6246eb4faa34f03fe88b64a0a9a942561c8e22f4b71f",
"sha256:cab50b8c2250b46fe738c77dbd25ce017d5e6fb35d3407606e7a4180656a5a6a",
"sha256:cef128cb4d5e0b3493f058f10ce32365972c554572ff821e175dbc6f8ff6924f",
"sha256:cf16e3cf6c0a5fdd9bc10c21687e19d29ad1fe863372b5543deaec1039581a30",
"sha256:e56c744aa6ff427a607763346e4170629caf7e48ead6921745986db3692f987f",
"sha256:e577934fc5f8779c554639376beeaa5657d54349096ef24abe8c74c5d9c117c3",
"sha256:f2b0fa0c01d8a0c7483afd9f31d7ecf2d71760ca24499c8697aeb5ca37dc090c"
],
"version": "==1.14.0"
},
"chardet": {
"hashes": [
"sha256:84ab92ed1c4d4f16916e05906b6b75a6c0fb5db821cc65e70cbd64a3e2a5eaae",
"sha256:fc323ffcaeaed0e0a02bf4d117757b98aed530d9ed4531e3e15460124c106691"
],
"version": "==3.0.4"
},
"idna": {
"hashes": [
"sha256:b307872f855b18632ce0c21c5e45be78c0ea7ae4c15c828c20788b26921eb3f6",
"sha256:b97d804b1e9b523befed77c48dacec60e6dcb0b5391d57af6a65a312a90648c0"
],
"version": "==2.10"
},
"overrides": {
"hashes": [
"sha256:30f761124579e59884b018758c4d7794914ef02a6c038621123fec49ea7599c6"
],
"index": "pypi",
"version": "==3.1.0"
},
"pycparser": {
"hashes": [
"sha256:2d475327684562c3a96cc71adf7dc8c4f0565175cf86b6d7a404ff4c771f15f0",
"sha256:7582ad22678f0fcd81102833f60ef8d0e57288b6b5fb00323d101be910e35705"
],
"version": "==2.20"
},
"pymongo": {
"hashes": [
"sha256:01b4e10027aef5bb9ecefbc26f5df3368ce34aef81df43850f701e716e3fe16d",
"sha256:0fc5aa1b1acf7f61af46fe0414e6a4d0c234b339db4c03a63da48599acf1cbfc",
"sha256:1396eb7151e0558b1f817e4b9d7697d5599e5c40d839a9f7270bd90af994ad82",
"sha256:18e84a3ec5e73adcb4187b8e5541b2ad61d716026ed9863267e650300d8bea33",
"sha256:19adf2848b80cb349b9891cc854581bbf24c338be9a3260e73159bdeb2264464",
"sha256:20ee0475aa2ba437b0a14806f125d696f90a8433d820fb558fdd6f052acde103",
"sha256:26798795097bdeb571f13942beef7e0b60125397811c75b7aa9214d89880dd1d",
"sha256:26e707a4eb851ec27bb969b5f1413b9b2eac28fe34271fa72329100317ea7c73",
"sha256:2a3c7ad01553b27ec553688a1e6445e7f40355fb37d925c11fcb50b504e367f8",
"sha256:2f07b27dbf303ea53f4147a7922ce91a26b34a0011131471d8aaf73151fdee9a",
"sha256:316f0cf543013d0c085e15a2c8abe0db70f93c9722c0f99b6f3318ff69477d70",
"sha256:31d11a600eea0c60de22c8bdcb58cda63c762891facdcb74248c36713240987f",
"sha256:334ef3ffd0df87ea83a0054454336159f8ad9c1b389e19c0032d9cb8410660e6",
"sha256:358ba4693c01022d507b96a980ded855a32dbdccc3c9331d0667be5e967f30ed",
"sha256:3a6568bc53103df260f5c7d2da36dffc5202b9a36c85540bba1836a774943794",
"sha256:444bf2f44264578c4085bb04493bfed0e5c1b4fe7c2704504d769f955cc78fe4",
"sha256:47a00b22c52ee59dffc2aad02d0bbfb20c26ec5b8de8900492bf13ad6901cf35",
"sha256:4c067db43b331fc709080d441cb2e157114fec60749667d12186cc3fc8e7a951",
"sha256:4c092310f804a5d45a1bcaa4191d6d016c457b6ed3982a622c35f729ff1c7f6b",
"sha256:53b711b33134e292ef8499835a3df10909c58df53a2a0308f598c432e9a62892",
"sha256:568d6bee70652d8a5af1cd3eec48b4ca1696fb1773b80719ebbd2925b72cb8f6",
"sha256:56fa55032782b7f8e0bf6956420d11e2d4e9860598dfe9c504edec53af0fc372",
"sha256:5a2c492680c61b440272341294172fa3b3751797b1ab983533a770e4fb0a67ac",
"sha256:61235cc39b5b2f593086d1d38f3fc130b2d125bd8fc8621d35bc5b6bdeb92bd2",
"sha256:619ac9aaf681434b4d4718d1b31aa2f0fce64f2b3f8435688fcbdc0c818b6c54",
"sha256:6238ac1f483494011abde5286282afdfacd8926659e222ba9b74c67008d3a58c",
"sha256:63752a72ca4d4e1386278bd43d14232f51718b409e7ac86bcf8810826b531113",
"sha256:6fdc5ccb43864065d40dd838437952e9e3da9821b7eac605ba46ada77f846bdf",
"sha256:7abc3a6825a346fa4621a6f63e3b662bbb9e0f6ffc32d30a459d695f20fb1a8b",
"sha256:7aef381bb9ae8a3821abd7f9d4d93978dbd99072b48522e181baeffcd95b56ae",
"sha256:80df3caf251fe61a3f0c9614adc6e2bfcffd1cd3345280896766712fb4b4d6d7",
"sha256:95f970f34b59987dee6f360d2e7d30e181d58957b85dff929eee4423739bd151",
"sha256:993257f6ca3cde55332af1f62af3e04ca89ce63c08b56a387cdd46136c72f2fa",
"sha256:9c0a57390549affc2b5dda24a38de03a5c7cbc58750cd161ff5d106c3c6eec80",
"sha256:a0794e987d55d2f719cc95fcf980fc62d12b80e287e6a761c4be14c60bd9fecc",
"sha256:a3b98121e68bf370dd8ea09df67e916f93ea95b52fc010902312168c4d1aff5d",
"sha256:a60756d55f0887023b3899e6c2923ba5f0042fb11b1d17810b4e07395404f33e",
"sha256:a676bd2fbc2309092b9bbb0083d35718b5420af3a42135ebb1e4c3633f56604d",
"sha256:a732838c78554c1257ff2492f5c8c4c7312d0aecd7f732149e255f3749edd5ee",
"sha256:ae65d65fde4135ef423a2608587c9ef585a3551fc2e4e431e7c7e527047581be",
"sha256:b070a4f064a9edb70f921bfdc270725cff7a78c22036dd37a767c51393fb956f",
"sha256:b6da85949aa91e9f8c521681344bd2e163de894a5492337fba8b05c409225a4f",
"sha256:bbf47110765b2a999803a7de457567389253f8670f7daafb98e059c899ce9764",
"sha256:c06b3f998d2d7160db58db69adfb807d2ec307e883e2f17f6b87a1ef6c723f11",
"sha256:c318fb70542be16d3d4063cde6010b1e4d328993a793529c15a619251f517c39",
"sha256:c4aef42e5fa4c9d5a99f751fb79caa880dac7eaf8a65121549318b984676a1b7",
"sha256:c9ca545e93a9c2a3bdaa2e6e21f7a43267ff0813e8055adf2b591c13164c0c57",
"sha256:da2c3220eb55c4239dd8b982e213da0b79023cac59fe54ca09365f2bc7e4ad32",
"sha256:dd8055da300535eefd446b30995c0813cc4394873c9509323762a93e97c04c03",
"sha256:e2b46e092ea54b732d98c476720386ff2ccd126de1e52076b470b117bff7e409",
"sha256:e334c4f39a2863a239d38b5829e442a87f241a92da9941861ee6ec5d6380b7fe",
"sha256:e5c54f04ca42bbb5153aec5d4f2e3d9f81e316945220ac318abd4083308143f5",
"sha256:f96333f9d2517c752c20a35ff95de5fc2763ac8cdb1653df0f6f45d281620606"
],
"index": "pypi",
"version": "==3.10.1"
},
"python-json-logger": {
"hashes": [
"sha256:b7a31162f2a01965a5efb94453ce69230ed208468b0bbc7fdfc56e6d8df2e281"
],
"index": "pypi",
"version": "==0.1.11"
},
"requests": {
"hashes": [
"sha256:b3559a131db72c33ee969480840fff4bb6dd111de7dd27c8ee1f820f4f00231b",
"sha256:fe75cc94a9443b9246fc7049224f75604b113c36acb93f87b80ed42c44cbb898"
],
"index": "pypi",
"version": "==2.24.0"
},
"six": {
"hashes": [
"sha256:30639c035cdb23534cd4aa2dd52c3bf48f06e5f4a941509c8bafd8ce11080259",
"sha256:8b74bedcbbbaca38ff6d7491d76f2b06b3592611af620f8426e82dddb04a5ced"
],
"version": "==1.15.0"
},
"urllib3": {
"hashes": [
"sha256:3018294ebefce6572a474f0604c2021e33b3fd8006ecd11d62107a5d2a963527",
"sha256:88206b0eb87e6d677d424843ac5209e3fb9d0190d0ee169599165ec25e9d9115"
],
"version": "==1.25.9"
}
},
"develop": {}
}

client/README.md Normal file

@@ -0,0 +1,14 @@
&copy; 2019 The Johns Hopkins University Applied Physics Laboratory LLC.
## Developer Environment
Required packages:
* python 3.6
* pipenv
`pipenv install --dev`
### Running client interactively
`./interactive_dev_run.sh` will connect the client using the dev stack values and then drop into
an interactive shell to allow interaction with the client.
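
A session might then look like the following sketch (the ports are the dev-stack defaults baked into `interactive_dev_run.sh`; the username and password are placeholders):

```python
>>> client.is_valid()       # backend, eve, and (if configured) mongo are reachable
True
>>> client.get_auth_module()
'eve'
>>> client.login_eve("someuser", "somepassword")
True
>>> pipelines = client.get_pipelines()
>>> client.logout()
```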

client/interactive_dev_run.sh Executable file

@@ -0,0 +1,27 @@
#!/bin/bash
# (C) 2019 The Johns Hopkins University Applied Physics Laboratory LLC.
DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" >/dev/null 2>&1 && pwd )"
BACKEND_PORT=${BACKEND_PORT:-5000}
EVE_PORT=${EVE_PORT:-5001}
MONGO_PORT=${MONGO_PORT:-27018}
pushd ${DIR} &> /dev/null
read -r -d '' CODE << EOF
import code;
import sys;
sys.path.append("${DIR}");
import pine.client;
pine.client.setup_logging();
client = pine.client.LocalPineClient("http://localhost:${BACKEND_PORT}", "http://localhost:${EVE_PORT}", "mongodb://localhost:${MONGO_PORT}");
code.interact(banner="",exitmsg="",local=locals())
EOF
echo "${CODE}"
echo ""
PINE_LOGGING_CONFIG_FILE=$(realpath ${DIR}/../shared/logging.python.dev.json) pipenv run python3 -c "${CODE}"
popd &> /dev/null


@@ -0,0 +1,20 @@
#!/bin/bash
# (C) 2019 The Johns Hopkins University Applied Physics Laboratory LLC.
DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" >/dev/null 2>&1 && pwd )"
source ${DIR}/../.env
export BACKEND_PORT
export EVE_PORT
export MONGO_PORT
if ! wget http://localhost:${BACKEND_PORT}/ping -O /dev/null -o /dev/null || ! wget http://localhost:${EVE_PORT} -O /dev/null -o /dev/null || ! wget http://localhost:${MONGO_PORT} -O /dev/null -o /dev/null; then
echo "Use docker-compose.test.yml when running docker compose stack."
exit 1
fi
pushd ${DIR} &> /dev/null
./interactive_dev_run.sh
popd &> /dev/null

client/pine/__init__.py Normal file

@@ -0,0 +1,4 @@
# (C) 2019 The Johns Hopkins University Applied Physics Laboratory LLC.
"""PINE main module.
"""


@@ -0,0 +1,8 @@
# (C) 2019 The Johns Hopkins University Applied Physics Laboratory LLC.
"""PINE client module.
"""
from .client import PineClient, LocalPineClient
from .log import setup_logging
from .models import CollectionBuilder


@@ -0,0 +1,737 @@
# (C) 2019 The Johns Hopkins University Applied Physics Laboratory LLC.
"""PINE client classes module.
"""
import abc
import json
import logging
import typing
from overrides import overrides
import pymongo
import requests
from . import exceptions, models, password
def _standardize_path(path: str, *additional_paths: typing.List[str]) -> typing.List[str]:
r"""Standardize path(s) into a list of path components.
:param path: relative path, e.g. ``"users"``
:type path: str
:param \*additional_paths: any additional path components in a list
:type \*additional_paths: list(str), optional
:return: the standardized path components in a list
:rtype: list(str)
"""
# if you change this, also update backend module pine.backend.data.service
if type(path) not in [list, tuple, set]:
path = [path]
if additional_paths:
path += additional_paths
# for every element in path, split by "/" into a list of paths, then remove empty values
# "/test" => ["test"], ["/test", "1"] => ["test", "1"], etc.
return [single_path for subpath in path for single_path in subpath.split("/") if single_path]
class BaseClient(object):
"""Base class for a client using a REST interface.
"""
__metaclass__ = abc.ABCMeta
def __init__(self, base_uri: str, name: str = None):
"""Constructor.
:param base_uri: the base URI for the server, e.g. ``"http://localhost:5000"``
:type base_uri: str
:param name: optional human-readable name for the server, defaults to None
:type name: str, optional
"""
self.base_uri: str = base_uri.strip("/")
"""The server's base URI.
:type: str
"""
self.session: requests.Session = None
"""The currently open session, or ``None``.
:type: requests.Session
"""
self.name: str = name
self.logger: logging.Logger = logging.getLogger(self.__class__.__name__)
@abc.abstractmethod
def is_valid(self) -> bool:
"""Returns whether this client and its connection(s) are valid.
:return: whether this client and its connection(s) are valid
:rtype: bool
"""
raise NotImplementedError()
def uri(self, path: str, *additional_paths: typing.List[str]) -> str:
r"""Makes a complete URI from the given path(s).
:param path: relative path, e.g. ``"users"``
:type path: str
:param \*additional_paths: any additional path components
:type \*additional_paths: list(str), optional
:return: the complete, standardized URI including the base URI, e.g. ``"http://localhost:5000/users"``
:rtype: str
"""
return "/".join([self.base_uri] + _standardize_path(path, *additional_paths))
def _req(self, method: str, path: str, *additional_paths: typing.List[str], **kwargs) -> requests.Response:
r"""Makes a :py:mod:`requests` call, checks for errors, and returns the response.
:param method: REST method (``"get"``, ``"post"``, etc.)
:type method: str
:param path: relative path, e.g. ``"users"``
:type path: str
:param \*additional_paths: any additional path components
:type \*additional_paths: list(str), optional
:param \**kwargs: any additional kwargs to send to :py:mod:`requests`
:type \**kwargs: dict
:raises exceptions.PineClientHttpException: if the HTTP request returns an error
:return: the :py:mod:`requests` :py:class:`requests.Response` object
:rtype: requests.Response
"""
uri = self.uri(path, *additional_paths)
self.logger.debug("{} {}".format(method.upper(), uri))
if self.session:
resp = self.session.request(method, uri, **kwargs)
else:
resp = requests.request(method, uri, **kwargs)
if not resp.ok:
uri = "\"/" + "/".join(_standardize_path(path, *additional_paths)) + "\""
raise exceptions.PineClientHttpException("{}".format(method.upper()),
"{} {}".format(self.name, uri) if self.name else uri,
resp)
return resp
def get(self, path: str, *additional_paths: typing.List[str], **kwargs) -> requests.Response:
r"""Makes a :py:mod:`requests` ``GET`` call, checks for errors, and returns the response.
:param path: relative path, e.g. ``"users"``
:type path: str
:param \*additional_paths: any additional path components
:type \*additional_paths: list(str), optional
:param \**kwargs: any additional kwargs to send to :py:mod:`requests`
:type \**kwargs: dict
:raises exceptions.PineClientHttpException: if the HTTP request returns an error
:return: the :py:mod:`requests` :py:class:`Response <requests.Response>` object
:rtype: requests.Response
"""
return self._req("GET", path, *additional_paths, **kwargs)
def put(self, path: str, *additional_paths: typing.List[str], **kwargs) -> requests.Response:
r"""Makes a :py:mod:`requests` ``PUT`` call, checks for errors, and returns the response.
:param path: relative path, e.g. ``"users"``
:type path: str
:param \*additional_paths: any additional path components
:type \*additional_paths: list(str), optional
:param \**kwargs: any additional kwargs to send to :py:mod:`requests`
:type \**kwargs: dict
:raises exceptions.PineClientHttpException: if the HTTP request returns an error
:return: the :py:mod:`requests` :py:class:`Response <requests.Response>` object
:rtype: requests.Response
"""
return self._req("PUT", path, *additional_paths, **kwargs)
def patch(self, path: str, *additional_paths: typing.List[str], **kwargs) -> requests.Response:
r"""Makes a :py:mod:`requests` ``PATCH`` call, checks for errors, and returns the response.
:param path: relative path, e.g. ``"users"``
:type path: str
:param \*additional_paths: any additional path components
:type \*additional_paths: list(str), optional
:param \**kwargs: any additional kwargs to send to :py:mod:`requests`
:type \**kwargs: dict
:raises exceptions.PineClientHttpException: if the HTTP request returns an error
:return: the :py:mod:`requests` :py:class:`Response <requests.Response>` object
:rtype: requests.Response
"""
return self._req("PATCH", path, *additional_paths, **kwargs)
def post(self, path: str, *additional_paths: typing.List[str], **kwargs) -> requests.Response:
r"""Makes a :py:mod:`requests` ``POST`` call, checks for errors, and returns the response.
:param path: relative path, e.g. ``"users"``
:type path: str
:param \*additional_paths: any additional path components
:type \*additional_paths: list(str), optional
:param \**kwargs: any additional kwargs to send to :py:mod:`requests`
:type \**kwargs: dict
:raises exceptions.PineClientHttpException: if the HTTP request returns an error
:return: the :py:mod:`requests` :py:class:`Response <requests.Response>` object
:rtype: requests.Response
"""
return self._req("POST", path, *additional_paths, **kwargs)
class EveClient(BaseClient):
"""A client to access Eve and, optionally, its underlying MongoDB instance.
"""
DEFAULT_DBNAME: str = "pmap_nlp"
"""The default DB name used by PINE.
:type: str
"""
def __init__(self, eve_base_uri: str, mongo_base_uri: str = None, mongo_dbname: str = DEFAULT_DBNAME):
"""Constructor.
:param eve_base_uri: the base URI for the eve server, e.g. ``"http://localhost:5001"``
:type eve_base_uri: str
:param mongo_base_uri: the base URI for the mongodb server, e.g. ``"mongodb://localhost:27018"``, defaults to ``None``
:type mongo_base_uri: str, optional
:param mongo_dbname: the DB name that PINE uses, defaults to ``"pmap_nlp"``
:type mongo_dbname: str, optional
"""
super().__init__(eve_base_uri, name="eve")
self.mongo_base_uri: str = mongo_base_uri
"""The base URI for the MongoDB server.
:type: str
"""
self.mongo: pymongo.MongoClient = pymongo.MongoClient(mongo_base_uri) if mongo_base_uri else None
"""The :py:class:`pymongo.mongo_client.MongoClient` instance.
:type: pymongo.mongo_client.MongoClient
"""
self.mongo_db: pymongo.database.Database = self.mongo[mongo_dbname] if self.mongo and mongo_dbname else None
"""The :py:class:`pymongo.database.Database` instance.
:type: pymongo.database.Database
"""
@overrides
def is_valid(self) -> bool:
if self.mongo_base_uri:
try:
if not pymongo.MongoClient(self.mongo_base_uri, serverSelectionTimeoutMS=1).server_info():
self.logger.error("Unable to connect to MongoDB")
return False
except:
self.logger.error("Unable to connect to MongoDB", exc_info=True)
return False
try:
self.ping()
except:
self.logger.error("Unable to ping eve", exc_info=True)
return False
return True
def ping(self) -> typing.Any:
"""Pings the eve server and returns the result.
:raises exceptions.PineClientHttpException: if the HTTP request returns an error
:returns: the JSON response from the server (probably ``"pong"``)
"""
return self.get("system/ping").json()
def about(self) -> dict:
"""Returns the 'about' dict from the server.
:raises exceptions.PineClientHttpException: if the HTTP request returns an error
:returns: the JSON response from the server
:rtype: dict
"""
return self.get("about").json()
def get_resource(self, resource: str, resource_id: str) -> dict:
"""Gets a resource from eve by its ID.
:raises exceptions.PineClientHttpException: if the HTTP request returns an error
:returns: the JSON object response from the server
:rtype: dict
"""
return self.get(resource, resource_id).json()
def _add_or_replace_resource(self, resource: str, obj: dict, valid_fn: typing.Callable[[dict, typing.Callable[[str], None]], bool] = None) -> str:
"""Adds or replaces the given resource.
:param resource: the resource type, e.g. ``"users"``
:type resource: str
:param obj: the resource object
:type obj: dict
:param valid_fn: a function to validate the resource object, defaults to ``None``
:type valid_fn: function, optional
:raises exceptions.PineClientValueException: if a valid_fn is passed in and the object fails
:raises exceptions.PineClientHttpException: if the HTTP request returns an error
:returns: the ID of the added/replaced resource
:rtype: str
"""
if valid_fn and not valid_fn(obj):
raise exceptions.PineClientValueException(obj, resource)
if models.ID_FIELD in obj:
try:
res = self.get_resource(resource, obj[models.ID_FIELD])
except exceptions.PineClientHttpException as e:
if e.resp.status_code == 404:
return self.post(resource, obj).json()[models.ID_FIELD]
else:
raise e
return self.put(resource, obj[models.ID_FIELD], json=obj, headers={"If-Match": res["_etag"]}).json()[models.ID_FIELD]
else:
return self.post(resource, obj).json()[models.ID_FIELD]
def _add_resources(self, resource: str, objs: typing.List[dict], valid_fn: typing.Callable[[dict, typing.Callable[[str], None]], bool] = None, replace_if_exists: bool = False):
"""Tries to add all the resource objects at once, optionally falling back to individual replacement if that fails.
:param resource: the resource type, e.g. ``"users"``
:type resource: str
:param objs: the resource objects
:type objs: list(dict)
:param valid_fn: a function to validate the resource object, defaults to ``None``
:type valid_fn: function, optional
:param replace_if_exists: whether to replace the resource with the given value if it already exists on the server, defaults to ``False``
:type replace_if_exists: bool, optional
:raises exceptions.PineClientValueException: if a valid_fn is passed in and any of the objects fails
:raises exceptions.PineClientHttpException: if the HTTP request returns an error
:returns: the IDs of the added resources
:rtype: list(str)
"""
if objs == None:
return []
if valid_fn:
for obj in objs:
if not valid_fn(obj, self.logger.warn):
raise exceptions.PineClientValueException(obj, resource)
try:
resp = self.post(resource, json=objs)
return [item[models.ID_FIELD] for item in resp.json()[models.ITEMS_FIELD]]
except exceptions.PineClientHttpException as e:
if e.resp.status_code == 409 and replace_if_exists:
return [self._add_or_replace_resource(resource, obj, valid_fn) for obj in objs]
else:
raise e
def add_users(self, users: typing.List[dict], replace_if_exists=False) -> typing.List[str]:
"""Adds the given users.
:param users: the user objects
:type users: list(dict)
:param replace_if_exists: whether to replace the resource with the given value if it already exists on the server, defaults to ``False``
:type replace_if_exists: bool, optional
:raises exceptions.PineClientValueException: if any of the user objects are not valid, see :py:func:`.models.is_valid_eve_user`
:raises exceptions.PineClientHttpException: if the HTTP request returns an error
:returns: the IDs of the added users
:rtype: list(str)
"""
for user in users:
if "password" in user:
user["passwdhash"] = password.hash_password(user["password"])
del user["password"]
return self._add_resources("users", users, valid_fn=models.is_valid_eve_user, replace_if_exists=replace_if_exists)
def get_users(self):
"""Gets all users.
:raises exceptions.PineClientHttpException: if the HTTP request returns an error
:returns: all the users
:rtype: list(dict)
"""
return self.get("users").json()[models.ITEMS_FIELD]
def add_pipelines(self, pipelines: typing.List[dict], replace_if_exists=False) -> typing.List[str]:
"""Adds the given pipelines.
:param pipelines: the pipeline objects
:type pipelines: list(dict)
:param replace_if_exists: whether to replace the resource with the given value if it already exists on the server, defaults to ``False``
:type replace_if_exists: bool, optional
:raises exceptions.PineClientValueException: if any of the pipeline objects are not valid, see :py:func:`.models.is_valid_eve_pipeline`
:raises exceptions.PineClientHttpException: if the HTTP request returns an error
:returns: the IDs of the added pipelines
:rtype: list(str)
"""
return self._add_resources("pipelines", pipelines, valid_fn=models.is_valid_eve_pipeline, replace_if_exists=replace_if_exists)
class PineClient(BaseClient):
"""A client to access PINE (more specifically: the backend).
"""
def __init__(self, backend_base_uri: str):
"""Constructor.
:param backend_base_uri: the base URI for the backend server, e.g. ``"http://localhost:5000"``
:type backend_base_uri: str
"""
super().__init__(backend_base_uri)
@overrides
def is_valid(self) -> bool:
try:
self.ping()
return True
except:
self.logger.error("Unable to ping PINE backend", exc_info=True)
return False
def ping(self) -> typing.Any:
"""Pings the backend server and returns the result.
:raises exceptions.PineClientHttpException: if the HTTP request returns an error
:returns: the JSON response from the server (probably ``"pong"``)
"""
return self.get("ping").json()
def about(self) -> dict:
"""Returns the 'about' dict from the server.
:raises exceptions.PineClientHttpException: if the HTTP request returns an error
:returns: the JSON response from the server
:rtype: dict
"""
return self.get("about").json()
def get_logged_in_user(self) -> dict:
"""Returns the currently logged in user, or None if not logged in.
:raises exceptions.PineClientHttpException: if the HTTP request returns an error
:returns: currently logged in user, or None if not logged in
:rtype: dict
"""
return self.get(["auth", "logged_in_user"]).json()
def get_my_user_id(self) -> str:
"""Returns the ID of the logged in user, or None if not logged in.
:raises exceptions.PineClientHttpException: if the HTTP request returns an error
:returns: the ID of the logged in user, or None if not logged in
:rtype: str
"""
u = self.get_logged_in_user()
return u["id"] if u and "id" in u else None
def is_logged_in(self) -> bool:
"""Returns whether the user is currently logged in or not.
:raises exceptions.PineClientHttpException: if the HTTP request returns an error
:returns: whether the user is currently logged in or not
:rtype: bool
"""
return self.session is not None and self.get_logged_in_user() is not None
def _check_login(self):
"""Checks whether user is logged in and raises an :py:class:`.exceptions.PineClientAuthException` if not.
:raises exceptions.PineClientAuthException: if not logged in
:raises exceptions.PineClientHttpException: if the HTTP request returns an error
"""
if not self.is_logged_in():
raise exceptions.PineClientAuthException("User is not logged in.")
def get_auth_module(self) -> str:
"""Returns the PINE authentication module, e.g. ``"eve"``.
:raises exceptions.PineClientHttpException: if the HTTP request returns an error
:returns: the PINE authentication module, e.g. ``"eve"``
:rtype: str
"""
return self.get(["auth", "module"]).json()
def login_eve(self, username: str, password: str) -> bool:
"""Logs in using eve credentials, and returns whether it was successful.
:param username: username
:type username: str
:param password: password
:type password: str
:raises exceptions.PineClientAuthException: if auth module is not eve or login was not successful
:raises exceptions.PineClientHttpException: if the HTTP request returns an error
:returns: whether the login was successful
:rtype: bool
"""
if self.get_auth_module() != "eve":
raise exceptions.PineClientAuthException("Auth module is not eve.")
if self.session:
self.logout()
self.session = requests.Session()
try:
self.post(["auth", "login"], json={
"username": username,
"password": password
})
return True
except exceptions.PineClientHttpException as e:
self.session.close()
self.session = None
if e.resp.status_code == 401:
raise exceptions.PineClientAuthException("Login failed for {}".format(username), cause=e)
else:
raise e
def logout(self):
"""Logs out the current user.
:raises exceptions.PineClientHttpException: if the HTTP request returns an error
"""
if self.is_logged_in():
self.post(["auth", "logout"])
if self.session:
self.session.close()
self.session = None
def get_pipelines(self) -> typing.List[dict]:
"""Returns all pipelines accessible to logged in user.
:raises exceptions.PineClientAuthException: if not logged in
:raises exceptions.PineClientHttpException: if the HTTP request returns an error
:returns: all pipelines accessible to logged in user
:rtype: list(dict)
"""
self._check_login()
return self.get("pipelines").json()[models.ITEMS_FIELD]
def collection_builder(self, **kwargs: dict) -> models.CollectionBuilder:
r"""Makes and returns a new :py:class:`.models.CollectionBuilder` with the logged in user.
:param \**kwargs: any additional args to pass in to the constructor
:type \**kwargs: dict
:returns: a new :py:class:`.models.CollectionBuilder` with the logged in user
:rtype: models.CollectionBuilder
"""
return models.CollectionBuilder(creator_id=self.get_my_user_id(), **kwargs)
def create_collection(self, builder: models.CollectionBuilder) -> str:
"""Creates a collection using the current value of the given builder and returns its ID.
:param builder: collection builder
:type builder: models.CollectionBuilder
:raises exceptions.PineClientValueException: if the given collection is not valid, see :py:func:`.models.is_valid_collection`
:raises exceptions.PineClientAuthException: if not logged in
:raises exceptions.PineClientHttpException: if the HTTP request returns an error
:returns: the created collection's ID
:rtype: str
"""
self._check_login()
if builder == None or not isinstance(builder, models.CollectionBuilder):
raise exceptions.PineClientValueException(builder, "CollectionBuilder")
if not builder.is_valid(self.logger.warn):
raise exceptions.PineClientValueException(builder, "collection")
return self.post("collections", data=builder.form_json, files=builder.files).json()[models.ID_FIELD]
def get_collection_documents(self, collection_id: str, truncate: bool, truncate_length: int = 30) -> typing.List[dict]:
"""Returns the documents in the given collection.
:param collection_id: the ID of the collection
:type collection_id: str
:param truncate: whether to truncate the document text (a good idea unless you need it)
:type truncate: bool
:param truncate_length: how many characters of the text you want if truncated, defaults to ``30``
:type truncate_length: int, optional
:returns: the documents in the given collection
:rtype: list(dict)
"""
return self.get(["documents", "by_collection_id", collection_id], params={
"truncate": json.dumps(truncate),
"truncateLength": json.dumps(truncate_length)
}).json()["_items"]
def add_document(self, document: dict = None, creator_id: str = None, collection_id: str = None,
overlap: int = None, text: str = None, metadata: dict = None) -> str:
"""Adds a new document to a collection and returns its ID.
Will use the logged in user ID for the creator_id if none is given. Although all the
parameters are optional, you must provide values either in the document or through the
kwargs in order to make a valid document.
:param document: optional document dict, will be overridden with any kwargs, defaults to ``None``
:type document: dict, optional
:param creator_id: optional creator_id for the document, defaults to ``None`` (not set)
:type creator_id: str, optional
:param collection_id: optional collection_id for the document, defaults to ``None`` (not set)
:type collection_id: str, optional
:param overlap: optional overlap for the document, defaults to ``None`` (not set)
:type overlap: int, optional
:param text: optional text for the document, defaults to ``None`` (not set)
:type text: str, optional
:param metadata: optional metadata for the document, defaults to ``None`` (not set)
:type metadata: dict, optional
:raises exceptions.PineClientValueException: if the given document parameters are not valid, see :py:func:`.models.is_valid_eve_document`
:raises exceptions.PineClientAuthException: if not logged in
:raises exceptions.PineClientHttpException: if the HTTP request returns an error
:returns: the created document's ID
:rtype: str
"""
self._check_login()
user_id = self.get_my_user_id()
if document == None or not isinstance(document, dict):
document = {}
if creator_id:
document["creator_id"] = creator_id
elif not "creator_id" in document:
document["creator_id"] = user_id
if collection_id:
document["collection_id"] = collection_id
if overlap != None:
document["overlap"] = overlap
if text:
document["text"] = text
if metadata != None:
document["metadata"] = metadata
if not models.is_valid_eve_document(document, self.logger.warn):
raise exceptions.PineClientValueException(document, "documents")
return self.post("documents", json=document).json()[models.ID_FIELD]
def add_documents(self, documents: typing.List[dict], creator_id: str = None, collection_id: str = None) -> typing.List[str]:
"""Adds multiple documents at once and returns their IDs.
Will use the logged in user ID for the creator_id if none is given.
:param documents: the documents to add
:type documents: list(dict)
:param creator_id: optional creator_id to set in the documents, defaults to ``None`` (not set)
:type creator_id: str, optional
:param collection_id: optional collection_id to set in the documents, defaults to ``None`` (not set)
:type collection_id: str, optional
:raises exceptions.PineClientValueException: if any of the given documents are not valid, see :py:func:`.models.is_valid_eve_document`
:raises exceptions.PineClientAuthException: if not logged in
:raises exceptions.PineClientHttpException: if the HTTP request returns an error
:returns: the created documents' IDs
:rtype: list(str)
"""
self._check_login()
user_id = self.get_my_user_id()
if documents == None or (not isinstance(documents, list) and not isinstance(documents, tuple)):
raise exceptions.PineClientValueException(documents, "documents")
for document in documents:
if creator_id:
document["creator_id"] = creator_id
elif "creator_id" not in document or not document["creator_id"]:
document["creator_id"] = user_id
if collection_id:
document["collection_id"] = collection_id
if not models.is_valid_eve_document(document, self.logger.warn):
raise exceptions.PineClientValueException(document, "documents")
return [doc["_id"] for doc in self.post("documents", json=documents).json()[models.ITEMS_FIELD]]
def annotate_document(self, document_id: str, doc_annotations: typing.List[str], ner_annotations: typing.List[typing.Union[dict, list, tuple]]) -> str:
"""Annotates the given document with the given values.
:param document_id: the document ID to annotate
:type document_id: str
:param doc_annotations: document annotations/labels
:type doc_annotations: list(str)
:param ner_annotations: NER annotations, where each annotation is either a list or a dict
:type ner_annotations: list
:raises exceptions.PineClientValueException: if any of the given annotations are not valid, see :py:func:`.models.is_valid_annotation`
:raises exceptions.PineClientAuthException: if not logged in
:raises exceptions.PineClientHttpException: if the HTTP request returns an error
:returns: the annotation ID
:rtype: str
"""
self._check_login()
if not document_id or not isinstance(document_id, str):
raise exceptions.PineClientValueException(document_id, "str")
body = {
"doc": doc_annotations,
"ner": ner_annotations
}
if not models.is_valid_annotation(body, self.logger.warn):
raise exceptions.PineClientValueException(body, "annotation")
return self.post(["annotations", "mine", "by_document_id", document_id], json=body).json()
def annotate_collection_documents(self, collection_id: str, document_annotations: dict, skip_document_updates=False) -> typing.List[str]:
"""Annotates documents in a collection.
:param collection_id: the ID of the collection containing the documents to annotate
:type collection_id: str
:param document_annotations: a dict containing "ner" list and "doc" list
:type document_annotations: dict
:param skip_document_updates: whether to skip updating the document "has_annotated" map, defaults to ``False``.
This should only be ``True`` if you properly set the
"has_annotated" map when you created the document.
:type skip_document_updates: bool
:raises exceptions.PineClientValueException: if any of the given annotations are not valid, see :py:func:`.models.is_valid_doc_annotations`
:raises exceptions.PineClientAuthException: if not logged in
:raises exceptions.PineClientHttpException: if the HTTP request returns an error
:returns: the annotation IDs
:rtype: list(str)
"""
self._check_login()
if not models.is_valid_doc_annotations(document_annotations, self.logger.warn):
raise exceptions.PineClientValueException(document_annotations, "document_annotations")
return self.post(["annotations", "mine", "by_collection_id", collection_id],
json=document_annotations,
params={"skip_document_updates":json.dumps(skip_document_updates)}).json()
class LocalPineClient(PineClient):
"""A client for a local PINE instance, including an :py:class<.EveClient>.
"""
def __init__(self, backend_base_uri: str, eve_base_uri: str, mongo_base_uri: str = None, mongo_dbname: str = EveClient.DEFAULT_DBNAME):
"""Constructor.
:param backend_base_uri: the base URI for the backend server, e.g. ``"http://localhost:5000"``
:type backend_base_uri: str
:param eve_base_uri: the base URI for the eve server, e.g. ``"http://localhost:5001"``
:type eve_base_uri: str
:param mongo_base_uri: the base URI for the mongodb server, e.g. ``"mongodb://localhost:27018"``, defaults to ``None``
:type mongo_base_uri: str, optional
:param mongo_dbname: the DB name that PINE uses, defaults to ``"pmap_nlp"``
:type mongo_dbname: str, optional
"""
super().__init__(backend_base_uri)
self.eve: EveClient = EveClient(eve_base_uri, mongo_base_uri, mongo_dbname=mongo_dbname)
"""The local :py:class:`EveClient` instance.
:type: EveClient
"""
@overrides
def is_valid(self) -> bool:
return super().is_valid() and self.eve.is_valid()
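# Usage sketch (not part of the original module): a LocalPineClient session against a dev stack
# on assumed default ports, with placeholder credentials and a hypothetical collection ID.
#
# client = LocalPineClient("http://localhost:5000", "http://localhost:5001",
#                          "mongodb://localhost:27018")
# client.login_eve("someuser", "somepassword")
# doc_ids = client.add_documents(
#     [{"text": "First document."}, {"text": "Second document."}],
#     collection_id="5babb6ee4eb7dd2c39b90000")
# client.annotate_document(doc_ids[0], ["some_label"], [])  # document label only, no NER spans
# client.logout()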


@@ -0,0 +1,65 @@
# (C) 2019 The Johns Hopkins University Applied Physics Laboratory LLC.
"""PINE client exceptions module.
"""
import requests
class PineClientException(Exception):
"""Base class for PINE client exceptions.
"""
def __init__(self, message: str, cause: Exception = None):
"""Constructor.
:param message: the message
:type message: str
:param cause: optional cause, defaults to ``None``
:type cause: Exception, optional
"""
super().__init__(message)
if cause:
self.__cause__ = cause
self.message = message
"""The message.
:type: str
"""
class PineClientHttpException(PineClientException):
"""A PINE client exception caused by an underlying HTTP exception.
"""
def __init__(self, method: str, path: str, resp: requests.Response):
"""Constructor.
:param method: the REST method (``"get"``, ``"post"``, etc.)
:type method: str
:param path: the human-readable path that caused the exception
:type path: str
:param resp: the :py:class:`Response <requests.Response>` with the error info
:type resp: requests.Response
"""
super().__init__("HTTP error with {} to {}: {} ({})".format(method, path, resp.status_code, resp.reason))
self.resp = resp
"""The :py:class:`Response <requests.Response>` with the error info
:type: requests.Response
"""
class PineClientValueException(PineClientException):
"""A PINE client exception caused by passing invalid data.
"""
def __init__(self, obj: dict, obj_type: str):
"""Constructor.
:param obj: the error data
:type obj: dict
:param obj_type: human-readable type of object
:type obj_type: str
"""
super().__init__("Object is not a valid type of {}".format(obj_type))
class PineClientAuthException(PineClientException):
pass
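# Usage sketch (illustrative): any failing HTTP call surfaces as a
# PineClientHttpException carrying the underlying requests.Response.
#
# try:
#     client.get("about")
# except PineClientHttpException as e:
#     print(e.message, e.resp.status_code, e.resp.reason)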

client/pine/client/log.py Normal file

@@ -0,0 +1,28 @@
# (C) 2019 The Johns Hopkins University Applied Physics Laboratory LLC.
import json
import logging.config
import os
# make sure this package has been installed
import pythonjsonlogger
CONFIG_FILE_ENV: str = "PINE_LOGGING_CONFIG_FILE"
"""The environment variable that optionally contains the file to use for logging configuration.
:type: str
"""
def setup_logging():
"""Sets up logging, if configured to do so.
The environment variable named by :py:data:`CONFIG_FILE_ENV` is checked and, if present, is
passed to :py:func:`logging.config.dictConfig`.
"""
if CONFIG_FILE_ENV not in os.environ:
return
file = os.environ[CONFIG_FILE_ENV]
if os.path.isfile(file):
with open(file, "r") as f:
logging.config.dictConfig(json.load(f))
logging.getLogger(__name__).info("Set logging configuration from file {}".format(file))
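# Usage sketch: point PINE_LOGGING_CONFIG_FILE at a logging dictConfig JSON
# before calling setup_logging(); the path below is illustrative.
#
# import os
# os.environ[CONFIG_FILE_ENV] = "shared/logging.python.dev.json"
# setup_logging()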


@@ -0,0 +1,937 @@
# (C) 2019 The Johns Hopkins University Applied Physics Laboratory LLC.
import io
import json
import typing
ID_FIELD: str = "_id"
"""The field used to store database ID.
:type: str
"""
ITEMS_FIELD: str = "_items"
"""The field used to access the items in a multi-item database response.
:type: str
"""
def _check_field_required_bool(obj: dict, field: str) -> bool:
"""Checks that the given field is in the object and is a bool.
:param obj: the object to check
:type obj: dict
:param field: the field to check
:type field: str
:returns: whether the given field is in the object and is a bool
:rtype: bool
"""
return field in obj and isinstance(obj[field], bool)
def _check_field_int(obj: dict, field: str) -> bool:
"""Checks that if the given field is in the object, that it is an int.
:param obj: the object to check
:type obj: dict
:param field: the field to check
:type field: str
:returns: if the given field is in the object, that it is an int
:rtype: bool
"""
return field not in obj or (obj[field] != None and isinstance(obj[field], int))
def _check_field_required_int(obj: dict, field: str) -> bool:
"""Checks that the given field is in the object and is an int.
:param obj: the object to check
:type obj: dict
:param field: the field to check
:type field: str
:returns: whether the given field is in the object and is an int
:rtype: bool
"""
return field in obj and obj[field] != None and isinstance(obj[field], int)
def _check_field_float(obj: dict, field: str) -> bool:
"""Checks that if the given field is in the object, that it is a float.
:param obj: the object to check
:type obj: dict
:param field: the field to check
:type field: str
:returns: if the given field is in the object, that it is a float
:rtype: bool
"""
return field not in obj or (obj[field] != None and (isinstance(obj[field], float) or isinstance(obj[field], int)))
def _check_field_required_float(obj: dict, field: str) -> bool:
"""Checks that the given field is in the object and is a float.
:param obj: the object to check
:type obj: dict
:param field: the field to check
:type field: str
:returns: whether the given field is in the object and is a float
:rtype: bool
"""
return field in obj and obj[field] != None and (isinstance(obj[field], float) or isinstance(obj[field], int))
def _check_field_string(obj: dict, field: str) -> bool:
"""Checks that if the given field is in the object, that it is a string.
:param obj: the object to check
:type obj: dict
:param field: the field to check
:type field: str
:returns: if the given field is in the object, that it is a string
:rtype: bool
"""
return field not in obj or isinstance(obj[field], str)
def _check_field_required_string(obj: dict, field: str) -> bool:
"""Checks that the given field is in the object and is a string.
:param obj: the object to check
:type obj: dict
:param field: the field to check
:type field: str
:returns: whether the given field is in the object and is a string
:rtype: bool
"""
return field in obj and obj[field] != None and isinstance(obj[field], str) and len(obj[field].strip()) != 0
def _check_field_string_list(obj: dict, field: str, min_length: int = 0) -> bool:
"""Checks that if the given field is in the object, that it is a string list.
:param obj: the object to check
:type obj: dict
:param field: the field to check
:type field: str
:param min_length: the minimum length of the list (if > 0), defaults to 0
:type min_length: int, optional
:returns: if the given field is in the object, that it is a string list
:rtype: bool
"""
if field not in obj:
return True
if not isinstance(obj[field], list) and not isinstance(obj[field], tuple):
return False
if min_length > 0 and len(obj[field]) < min_length:
return False
for elem in obj[field]:
if elem == None or not isinstance(elem, str) or len(elem.strip()) == 0:
return False
return True
def _check_field_required_string_list(obj: dict, field: str, min_length: int = 0) -> bool:
"""Checks that the given field is in the object and is a string list.
:param obj: the object to check
:type obj: dict
:param field: the field to check
:type field: str
:param min_length: the minimum length of the list (if > 0), defaults to 0
:type min_length: int, optional
:returns: whether the given field is in the object and is a string list
:rtype: bool
"""
if field not in obj or obj[field] == None or (not isinstance(obj[field], list) and not isinstance(obj[field], tuple)):
return False
if min_length > 0 and len(obj[field]) < min_length:
return False
for elem in obj[field]:
if elem == None or not isinstance(elem, str) or len(elem.strip()) == 0:
return False
return True
def _check_field_dict(obj: dict, field: str) -> bool:
"""Checks that if the given field is in the object, that it is a dict.
:param obj: the object to check
:type obj: dict
:param field: the field to check
:type field: str
:returns: if the given field is in the object, that it is a dict
:rtype: bool
"""
return field not in obj or isinstance(obj[field], dict)
def _check_field_required_dict(obj: dict, field: str) -> bool:
"""Checks that the given field is in the object and is a dict.
:param obj: the object to check
:type obj: dict
:param field: the field to check
:type field: str
:returns: whether the given field is in the object and is a dict
:rtype: bool
"""
return field in obj and obj[field] != None and isinstance(obj[field], dict)
def _check_field_bool(obj: dict, field: str) -> bool:
"""Checks that if the given field is in the object, that it is a bool.
:param obj: the object to check
:type obj: dict
:param field: the field to check
:type field: str
:returns: if the given field is in the object, that it is a bool
:rtype: bool
"""
return field not in obj or isinstance(obj[field], bool)
####################################################################################################
def is_valid_eve_user(user: dict, error_callback: typing.Callable[[str], None] = None) -> bool:
"""Checks whether the given user object is valid.
A valid user object has an ``_id``, ``firstname``, and ``lastname`` that are non-empty string
fields. If ``email``, ``description``, or ``passwdhash`` are present, they are string fields.
If ``role`` is present, it is a list of strings that are either ``administrator`` or ``user``.
:param user: user object
:type user: dict
:param error_callback: optional callback that is called with any error messages, defaults to ``None``
:type error_callback: function, optional
:returns: whether the given user object is valid
:rtype: bool
"""
if user == None or not isinstance(user, dict):
if error_callback:
error_callback("Given object is not a dict.")
return False
if not _check_field_required_string(user, ID_FIELD) or \
not _check_field_required_string(user, "firstname") or \
not _check_field_required_string(user, "lastname"):
if error_callback:
error_callback("Given object is missing {}, firstname, or lastname fields.".format(ID_FIELD))
return False
if not _check_field_string(user, "email") or \
not _check_field_string(user, "description") or \
not _check_field_string(user, "passwdhash"):
if error_callback:
error_callback("Fields email, description, or passwd hash are not valid.")
return False
if "role" in user:
if user["role"] == None or (not isinstance(user["role"], list) and not isinstance(user["role"], tuple)):
if error_callback:
error_callback("Field role is not a list.")
return False
for role in user["role"]:
if role == None or not isinstance(role, str) or role not in ["administrator", "user"]:
error_callback("One or mole roles is not valid.")
return False
return True
def is_valid_eve_pipeline(pipeline: dict, error_callback: typing.Callable[[str], None] = None) -> bool:
"""Checks whether the given pipeline object is valid.
A valid pipeline has an ``_id``, ``title``, and ``name`` that are non-empty string fields. If
``description`` is provided, it is a string field. If ``parameters`` are provided, it is a
dict field.
:param pipeline: pipeline object
:type pipeline: dict
:param error_callback: optional callback that is called with any error messages, defaults to ``None``
:type error_callback: function, optional
:returns: whether the given pipeline object is valid
:rtype: bool
"""
if pipeline == None or not isinstance(pipeline, dict):
if error_callback:
error_callback("Given object is not a dict.")
return False
if not _check_field_required_string(pipeline, ID_FIELD) or \
not _check_field_required_string(pipeline, "title") or \
not _check_field_required_string(pipeline, "name"):
if error_callback:
error_callback("Given object is missing {}, title, or name fields.".format(ID_FIELD))
return False
if not _check_field_string(pipeline, "description"):
if error_callback:
error_callback("Field description is not valid.")
return False
if "parameters" in pipeline and (pipeline["parameters"] == None or not isinstance(pipeline["parameters"], dict)):
if error_callback:
error_callback("Field parameters is not a dict.")
return False
return True
def is_valid_eve_collection(collection: dict, error_callback: typing.Callable[[str], None] = None) -> bool:
"""Checks whether the given collection object is valid.
A valid collection has a ``creator_id`` that is a non-empty string field. It has a ``labels``
that is a non-empty list of strings. If ``annotators`` or ``viewers`` are provided, they are
lists of strings. If ``metadata`` or ``configuration`` are provided, they are dicts. If
``archived`` is provided, it is a bool.
:param collection: collection object
:type collection: dict
:param error_callback: optional callback that is called with any error messages, defaults to ``None``
:type error_callback: function, optional
:returns: whether the given collection object is valid
:rtype: bool
"""
if collection == None or not isinstance(collection, dict):
if error_callback:
error_callback("Given object is not a dict.")
return False
if not _check_field_required_string(collection, "creator_id") or \
not _check_field_required_string_list(collection, "labels"):
if error_callback:
error_callback("Given object is missing creator_id or labels fields.")
return False
if not _check_field_string_list(collection, "annotators") or \
not _check_field_string_list(collection, "viewers") or \
not _check_field_dict(collection, "metadata") or \
not _check_field_dict(collection, "configuration") or \
not _check_field_bool(collection, "archived"):
if error_callback:
error_callback("Field annotators, viewers, metadata, configuration, or archived is not valid.")
return False
return True
def is_valid_collection(form: dict, files: dict, error_callback: typing.Callable[[str], None] = None) -> bool:
"""Checks whether the given form and files parameters are valid for creating a collection.
A valid form has a ``collection`` that is a dict field and is valid via
:py:func:`.is_valid_eve_collection`. Additionally, the collection has string ``title`` and
``description`` fields in its ``metadata``. It also has at least one element for ``labels``,
``viewers``, and ``annotators``, and the ``creator_id`` must be in both ``viewers`` and
``annotators``.
The form also has ``overlap`` as a float field between 0 and 1 (inclusive), ``train_every`` as
an int field that is at least 5, and ``pipelineId`` as a string field.
If files are provided, file ``file`` and any files starting with ``imageFile`` are checked.
If a file ``file`` is provided, the form must also have a boolean ``csvHasHeader`` field and an
int ``csvTextCol`` field.
:param form: form data to send to backend
:type form: dict
:param files: file data to send to backend
:type files: dict
:param error_callback: optional callback that is called with any error messages, defaults to ``None``
:type error_callback: function, optional
:returns: whether the given form and files parameters are valid for creating a collection
:rtype: bool
"""
if not form or not isinstance(form, dict) or not _check_field_required_dict(form, "collection"):
if error_callback:
error_callback("Missing or invalid collection.")
return False
collection = form["collection"]
if not is_valid_eve_collection(collection, error_callback=error_callback):
return False
if not _check_field_required_dict(collection, "metadata"):
if error_callback:
error_callback("Missing or invalid metadata.")
return False
md = collection["metadata"]
if not _check_field_required_string(md, "title") or \
not _check_field_required_string(md, "description"):
if error_callback:
error_callback("Missing metadata title or description.")
return False
if not _check_field_required_string_list(collection, "labels", 1) or \
not _check_field_required_string_list(collection, "viewers", 1) or \
not _check_field_required_string_list(collection, "annotators", 1):
if error_callback:
error_callback("Need at least one label, viewer, and annotator.")
return False
if collection["creator_id"] not in collection["viewers"] or collection["creator_id"] not in collection["annotators"]:
if error_callback:
error_callback("Creator ID should be in viewers and annotators.")
return False
if not _check_field_required_float(form, "overlap") or \
not _check_field_required_int(form, "train_every") or \
not _check_field_required_string(form, "pipelineId"):
if error_callback:
error_callback("Missing fields overlap, train_every, or pipelineId.")
return False
if form["overlap"] < 0 or form["overlap"] > 1:
if error_callback:
error_callback("Field overlap must be between 0 and 1.")
return False
if form["train_every"] < 5:
if error_callback:
error_callback("Field train_every must be >= 5.")
return False
if files:
for key in files:
if key == "file":
if not _check_field_required_bool(form, "csvHasHeader") or \
not _check_field_required_int(form, "csvTextCol"):
if error_callback:
error_callback("Missing fields csvHasHeader or csvTextCol.")
return False
if not isinstance(files[key], io.IOBase):
if error_callback:
error_callback("File {} is not an open file.".format(key))
return False
elif key.startswith("imageFile"):
if not isinstance(files[key], io.IOBase):
if error_callback:
error_callback("File {} is not an open file.".format(key))
return False
return True
def is_valid_eve_document(document: dict, error_callback: typing.Callable[[str], None] = None) -> bool:
"""Checks whether the given document object is valid.
A valid document has a ``creator_id`` and ``collection_id`` that are non-empty string fields.
    Optionally, it may have an int ``overlap`` field, a string ``text`` field, and dict
    ``metadata`` and ``has_annotated`` fields.
:param document: document object
:type document: dict
:param error_callback: optional callback that is called with any error messages, defaults to ``None``
:type error_callback: function, optional
:returns: whether the given document object is valid
:rtype: bool
"""
    if document is None or not isinstance(document, dict):
if error_callback:
error_callback("Given object is not a dict.")
return False
if not _check_field_required_string(document, "creator_id") or \
not _check_field_required_string(document, "collection_id"):
if error_callback:
error_callback("Missing required string fields creator_id and collection_id.")
return False
if not _check_field_int(document, "overlap") or \
not _check_field_string(document, "text") or \
not _check_field_dict(document, "metadata") or \
not _check_field_dict(document, "has_annotated"):
if error_callback:
error_callback("Invalid fields overlap, text, metadata, or has_annotated.")
return False
return True
def is_valid_doc_annotation(ann: typing.Any, error_callback: typing.Callable[[str], None] = None) -> bool:
"""Checks whether the given annotation is a valid document label/annotation.
This means that it is a non-empty string.
:param ann: annotation
:param error_callback: optional callback that is called with any error messages, defaults to ``None``
:type error_callback: function, optional
:returns: whether the given annotation is a valid document label/annotation
:rtype: bool
"""
if not ann or not isinstance(ann, str):
if error_callback:
error_callback("Doc annotation is not a string.")
return False
if len(ann.strip()) == 0:
if error_callback:
error_callback("Doc annotation is empty.")
return False
return True
def is_valid_ner_annotation(ann: typing.Any, error_callback: typing.Callable[[str], None] = None) -> bool:
"""Checks whether the given annotation is a valid document NER annotation.
Valid NER annotations take one of two forms: a :py:class:`dict` or a
:py:class:`list`/:py:class:`tuple` of size 3.
A valid NER :py:class:`dict` has the following fields:
* ``start``: an :py:class:`int` that is >= 0
* ``end``: an :py:class:`int` that is >= 0
* ``label``: a non-empty :py:class:`str`
A valid NER :py:class:`list`/:py:class:`tuple` has the following elements:
* element ``0``: an :py:class:`int` that is >= 0
* element ``1``: an :py:class:`int` that is >= 0
* element ``2``: a non-empty :py:class:`str`
:param ann: annotation
:param error_callback: optional callback that is called with any error messages, defaults to ``None``
:type error_callback: function, optional
:returns: whether the given annotation is a valid document label/annotation
:rtype: bool
"""
    if isinstance(ann, dict):
        if "start" not in ann or not isinstance(ann["start"], int) or ann["start"] < 0:
            if error_callback:
                error_callback("Field start is not valid ({}).".format(ann.get("start")))
            return False
        if "end" not in ann or not isinstance(ann["end"], int) or ann["end"] < 0:
            if error_callback:
                error_callback("Field end is not valid ({}).".format(ann.get("end")))
            return False
        if "label" not in ann or not isinstance(ann["label"], str) or len(ann["label"].strip()) == 0:
            if error_callback:
                error_callback("Field label is not valid ({}).".format(ann.get("label")))
            return False
    elif isinstance(ann, (list, tuple)):
        if len(ann) != 3:
            if error_callback:
                error_callback("Annotation length is not 3.")
            return False
        if not isinstance(ann[0], int) or ann[0] < 0 or not isinstance(ann[1], int) or ann[1] < 0 or \
                not isinstance(ann[2], str) or len(ann[2].strip()) == 0:
            if error_callback:
                error_callback("Annotation's first element must be int, second element must be int, third element must be string.")
            return False
    else:
        if error_callback:
            error_callback("NER annotation is not a dict or list/tuple ({}).".format(type(ann)))
        return False
return True
def is_valid_annotation(body: dict, error_callback: typing.Callable[[str], None] = None) -> bool:
"""Checks whether the given body is valid to create an annotation.
A valid body is a :py:class:`dict` with two fields:
* ``doc``: a list of valid doc annotations (see :py:func:`.is_valid_doc_annotation`)
* ``ner``: a list of valid NER annotations (see :py:func:`.is_valid_ner_annotation`)
:param body: annotation body
:type body: dict
:param error_callback: optional callback that is called with any error messages, defaults to ``None``
:type error_callback: function, optional
:returns: whether the given body is valid to create an annotation
:rtype: bool
"""
if not body or not isinstance(body, dict):
if error_callback:
error_callback("Body is not a dict ({}).".format(type(body)))
return False
if not _check_field_required_string_list(body, "doc"):
if error_callback:
error_callback("Missing string list field doc.")
return False
for ann in body["doc"]:
if not is_valid_doc_annotation(ann, error_callback=error_callback):
return False
if "ner" not in body or (not isinstance(body["ner"], list) and not isinstance(body["ner"], tuple)):
if error_callback:
error_callback("Invalid NER annotation field ner.")
return False
for ann in body["ner"]:
if not is_valid_ner_annotation(ann, error_callback=error_callback):
return False
return True
def is_valid_doc_annotations(doc_annotations: dict, error_callback: typing.Callable[[str], None] = None) -> bool:
"""Checks whether the given document annotations are valid.
A valid document annotations object is a :py:class:`dict`, where the keys are :py:class:`str`
document IDs, and the values are valid annotation bodies (see :py:func:`.is_valid_annotation`).
:param doc_annotations: document annotations
    :type doc_annotations: dict
:param error_callback: optional callback that is called with any error messages, defaults to ``None``
:type error_callback: function, optional
    :returns: whether the given document annotations are valid
:rtype: bool
"""
if not doc_annotations or not isinstance(doc_annotations, dict):
if error_callback:
error_callback("Annotations is not a dict ({}).".format(type(doc_annotations)))
return False
for doc_id in doc_annotations:
if not doc_id or not isinstance(doc_id, str) or len(doc_id.strip()) == 0:
if error_callback:
error_callback("Document ID is not valid ({}).".format(doc_id))
return False
annotations = doc_annotations[doc_id]
if not is_valid_annotation(annotations, error_callback=error_callback):
return False
return True
####################################################################################################
class CollectionBuilder(object):
"""A class that can build the form and files fields that are necessary to create a collection.
"""
def __init__(self,
collection: dict = None,
creator_id: str = None,
viewers: typing.List[str] = None,
annotators: typing.List[str] = None,
labels: typing.List[str] = None,
title: str = None,
description: str = None,
allow_overlapping_ner_annotations: bool = None,
pipelineId: str = None,
overlap: float = None,
train_every: int = None,
classifierParameters: dict = None,
document_csv_file: str = None,
document_csv_file_has_header: bool = None,
document_csv_file_text_column: int = None,
image_files: typing.List[str] = None):
"""Constructor.
:param collection: starting parameters for the collection, defaults to ``None`` (not set)
:type collection: dict, optional
:param creator_id: user ID for the creator, see :py:meth:`.creator_id`, defaults to ``None`` (not set)
:type creator_id: str, optional
:param viewers: viewer IDs for the collection, see :py:meth:`.viewer`, defaults to ``None`` (not set)
:type viewers: list(str), optional
:param annotators: annotator IDs for the collection, see :py:meth:`.annotator`, defaults to ``None`` (not set)
:type annotators: list(str), optional
:param labels: labels for the collection, see :py:meth:`.label`, defaults to ``None`` (not set)
:type labels: list(str), optional
:param title: metadata title, see :py:meth:`.title`, defaults to ``None`` (not set)
:type title: str, optional
:param description: metadata description, see :py:meth:`.description`, defaults to ``None`` (not set)
:type description: str, optional
:param allow_overlapping_ner_annotations: optional configuration for allowing overlapping NER
annotations, see :py:meth:`.allow_overlapping_ner_annotations`,
defaults to ``None`` (not set)
:type allow_overlapping_ner_annotations: bool
:param pipelineId: the ID of the pipeline from which to create the classifier,
see :py:meth:`.classifier`, defaults to ``None`` (not set)
:type pipelineId: str, optional
:param overlap: the classifier overlap, see :py:meth:`.classifier`, defaults to ``None`` (not set)
:type overlap: float, optional
:param train_every: train the model after this many documents are annotated,
see :py:meth:`.classifier`, defaults to ``None`` (not set)
:type train_every: int, optional
:param classifierParameters: any parameters to pass to the classifier,
see :py:meth:`.classifier`, defaults to ``None`` (not set)
:type classifierParameters: dict, optional
:param document_csv_file: the filename of the local document CSV file,
            see :py:meth:`.document_csv_file`, defaults to ``None`` (not set)
:type document_csv_file: str, optional
:param document_csv_file_has_header: whether the document CSV file has a header,
            see :py:meth:`.document_csv_file`, defaults to ``None`` (not set)
:type document_csv_file_has_header: bool, optional
:param document_csv_file_text_column: if the document CSV file has headers, the document text
can be found in this column index (the others are used for
            document metadata), see :py:meth:`.document_csv_file`,
defaults to ``None`` (not set)
:type document_csv_file_text_column: int, optional
:param image_files: any image files to add to the collection, see :py:meth:`.image_file`,
defaults to ``None`` (not set)
:type image_files: list(str)
"""
self.form = {
"collection": {
"creator_id": None,
"metadata": {
"title": None,
"description": None
},
"configuration": {
},
"labels": None,
"viewers": None,
"annotators": None
}
}
"""The form data.
:type: dict
"""
if collection:
self.form["collection"].update(collection)
self.files = {}
"""The files data.
:type: dict
"""
self._image_file_counter = 0
if creator_id:
self.creator_id(creator_id)
if viewers:
for viewer in viewers: self.viewer(viewer)
if annotators:
            for annotator in annotators: self.annotator(annotator)
if labels:
for label in labels: self.label(label)
if title:
self.title(title)
if description:
self.description(description)
        if allow_overlapping_ner_annotations is not None:
self.allow_overlapping_ner_annotations(allow_overlapping_ner_annotations)
        if document_csv_file and document_csv_file_has_header is not None and document_csv_file_text_column is not None:
self.document_csv_file(document_csv_file, document_csv_file_has_header, document_csv_file_text_column)
if image_files:
for image_file in image_files: self.image_file(image_file)
if pipelineId:
kwargs = {}
            if overlap is not None: kwargs["overlap"] = overlap
            if train_every is not None: kwargs["train_every"] = train_every
            if classifierParameters is not None: kwargs["classifierParameters"] = classifierParameters
self.classifier(pipelineId, **kwargs)
@property
def collection(self) -> dict:
"""Returns the collection information from the form.
:returns: collection information from the form
:rtype: dict
"""
return self.form["collection"]
@property
def form_json(self) -> dict:
"""Returns the form where the values have been JSON-encoded.
:returns: the form where the values have been JSON-encoded
:rtype: dict
"""
return {key: json.dumps(value) for (key, value) in self.form.items()}
def creator_id(self, user_id: str) -> "CollectionBuilder":
"""Sets the creator_id to the given, and adds to viewers and annotators.
:param user_id: the user ID to use for the creator_id
:type user_id: str
:returns: self
:rtype: models.CollectionBuilder
"""
self.collection["creator_id"] = user_id
self.viewer(user_id)
self.annotator(user_id)
return self
def viewer(self, user_id: str) -> "CollectionBuilder":
"""Adds the given user to the list of viewers.
:param user_id: the user ID to add as a viewer
:type user_id: str
:returns: self
:rtype: models.CollectionBuilder
"""
if not self.collection["viewers"]:
self.collection["viewers"] = [user_id]
elif user_id not in self.collection["viewers"]:
self.collection["viewers"].append(user_id)
return self
def annotator(self, user_id: str) -> "CollectionBuilder":
"""Adds the given user to the list of annotators.
:param user_id: the user ID to add as an annotator
:type user_id: str
:returns: self
:rtype: models.CollectionBuilder
"""
if not self.collection["annotators"]:
self.collection["annotators"] = [user_id]
elif user_id not in self.collection["annotators"]:
self.collection["annotators"].append(user_id)
return self
def label(self, label: str) -> "CollectionBuilder":
"""Adds the given label to the collection.
:param label: label to add
:type label: str
:returns: self
:rtype: models.CollectionBuilder
"""
if not self.collection["labels"]:
self.collection["labels"] = [label]
elif label not in self.collection["labels"]:
self.collection["labels"].append(label)
return self
def metadata(self, key: str, value: typing.Any) -> "CollectionBuilder":
"""Adds the given metadata key/value to the collection.
:param key: metadata key
:type key: str
:param value: metadata value
:returns: self
:rtype: models.CollectionBuilder
"""
self.collection["metadata"][key] = value
return self
def title(self, title: str) -> "CollectionBuilder":
"""Sets the metadata title to the given.
:param title: collection title
:type title: str
:returns: self
:rtype: models.CollectionBuilder
"""
return self.metadata("title", title)
def description(self, description: str) -> "CollectionBuilder":
"""Sets the metadata description to the given.
:param description: collection description
:type description: str
:returns: self
:rtype: models.CollectionBuilder
"""
return self.metadata("description", description)
def configuration(self, key: str, value: typing.Any) -> "CollectionBuilder":
"""Adds the given configuration key/value to the collection.
:param key: configuration key
:type key: str
:param value: configuration value
:returns: self
:rtype: models.CollectionBuilder
"""
self.collection["configuration"][key] = value
return self
    def allow_overlapping_ner_annotations(self, allow_overlapping_ner_annotations: bool) -> "CollectionBuilder":
"""Sets the configuration value for allow_overlapping_ner_annotations to the given.
:param allow_overlapping_ner_annotations: whether to allow overlapping NER annotations
:type allow_overlapping_ner_annotations: bool
:returns: self
:rtype: models.CollectionBuilder
"""
return self.configuration("allow_overlapping_ner_annotations", allow_overlapping_ner_annotations)
    def classifier(self, pipelineId: str, overlap: float = 0.0, train_every: int = 100, classifierParameters: dict = None) -> "CollectionBuilder":
        """Sets classifier information for the created collection.
        :param pipelineId: the ID of the pipeline from which to create the classifier
        :type pipelineId: str
        :param overlap: the classifier overlap, defaults to ``0.0``
        :type overlap: float, optional
        :param train_every: train the model after this many documents are annotated, defaults to ``100``
        :type train_every: int, optional
        :param classifierParameters: any parameters to pass to the classifier, defaults to ``None`` (treated as ``{}``)
        :type classifierParameters: dict, optional
        :returns: self
        :rtype: models.CollectionBuilder
        """
        self.form["pipelineId"] = pipelineId
        self.form["overlap"] = overlap
        self.form["train_every"] = train_every
        self.form["classifierParameters"] = classifierParameters if classifierParameters is not None else {}
        return self
def document_csv_file(self, csv_filename: str, has_header: bool, text_column: int) -> "CollectionBuilder":
"""Sets the CSV file used to create documents to the given.
May raise an Exception if there is a problem opening the indicated file.
:param csv_filename: the filename of the local CSV file
:type csv_filename: str
:param has_header: whether the CSV file has a header
:type has_header: bool
:param text_column: if the CSV file has headers, the document text can be found in this column index
(the others are used for document metadata)
:type text_column: int
:returns: self
:rtype: models.CollectionBuilder
"""
if "file" in self.files and self.files["file"]:
self.files["file"].close()
self.files["file"] = open(csv_filename, "rb")
self.form["csvHasHeader"] = has_header
self.form["csvTextCol"] = text_column
return self
def image_file(self, image_filename: str) -> "CollectionBuilder":
"""Adds the given image file to the collection.
May raise an Exception if there is a problem opening the indicated file.
:param image_filename: the filename of the local image file
:type image_filename: str
:returns: self
:rtype: models.CollectionBuilder
"""
self.files["imageFile{}".format(self._image_file_counter)] = open(image_filename, "rb")
self._image_file_counter += 1
return self
    def is_valid(self, error_callback: typing.Callable[[str], None] = None) -> bool:
"""Checks whether the currently set values are valid or not.
See :py:func:`.is_valid_collection`.
:param error_callback: optional callback that is called with any error messages, defaults to ``None``
:type error_callback: function, optional
:returns: whether the currently set values are valid or not
:rtype: bool
"""
return is_valid_collection(self.form, self.files, error_callback=error_callback)
####################################################################################################
def remove_eve_fields(obj: dict, remove_timestamps: bool = True, remove_versions: bool = True):
"""Removes fields inserted by eve from the given object.
:param obj: the object
:type obj: dict
:param remove_timestamps: whether to remove the timestamp fields, defaults to ``True``
:type remove_timestamps: bool
:param remove_versions: whether to remove the version fields, defaults to ``True``
:type remove_versions: bool
"""
fields = ["_etag", "_links"]
if remove_timestamps: fields += ["_created", "_updated"]
if remove_versions: fields += ["_version", "_latest_version"]
for f in fields:
if f in obj:
del obj[f]
def remove_nonupdatable_fields(obj: dict):
"""Removes all non-updatable fields from the given object.
These fields would cause a ``PUT``/``PATCH`` to be rejected because they are not user-modifiable.
:param obj: the object
:type obj: dict
"""
remove_eve_fields(obj)
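Taken together, the validators and `CollectionBuilder` above are the client-side gate before a new collection is posted to the backend. A minimal usage sketch (hypothetical, not part of this commit; the `pine.client.models` import path is an assumption based on the new `client/` package):

```python
from pine.client import models  # assumed import path for the new client package

# Build the form/files pair for a new collection.
builder = models.CollectionBuilder(
    creator_id="user-1",                    # also added to viewers and annotators
    labels=["PERSON", "ORG"],
    title="Demo collection",
    description="A small demo collection",
    pipelineId="5babb6ee4eb7dd2c39b9671c",  # an existing pipeline ID
    overlap=0.1,
    train_every=10,
)

# Validate before sending; errors are reported through the callback.
errors = []
if builder.is_valid(error_callback=errors.append):
    payload = builder.form_json  # form values JSON-encoded for a multipart POST
else:
    print("Invalid collection:", errors)

# The standalone validators also work on their own; both NER annotation
# forms are accepted: a dict or a (start, end, label) triple.
assert models.is_valid_ner_annotation({"start": 0, "end": 6, "label": "PERSON"})
assert models.is_valid_ner_annotation((7, 10, "ORG"))
```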

View File

@@ -0,0 +1,33 @@
# (C) 2019 The Johns Hopkins University Applied Physics Laboratory LLC.
import base64
import bcrypt
import hashlib
def hash_password(password: str) -> str:
"""Hashes the given password for use in user object.
:param password: password
:type password: str
:returns: hashed password
:rtype: str
"""
sha256 = hashlib.sha256(password.encode()).digest().replace(b"\x00", b"")
hashed_password_bytes = bcrypt.hashpw(sha256, bcrypt.gensalt())
return base64.b64encode(hashed_password_bytes).decode()
def check_password(password: str, hashed_password: str) -> bool:
"""Checks the given password against the given hash.
:param password: password to check
:type password: str
:param hashed_password: hashed password to check against
:type hashed_password: str
:returns: whether the password matches the hash
:rtype: bool
"""
sha256 = hashlib.sha256(password.encode()).digest().replace(b"\x00", b"")
hashed_password_bytes = base64.b64decode(hashed_password.encode())
return bcrypt.checkpw(sha256, hashed_password_bytes)
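
The SHA-256 pre-hash keeps arbitrarily long passwords within bcrypt's input limit while still getting bcrypt's adaptive work factor. A quick round-trip sketch (hypothetical usage, not part of the commit):

```python
hashed = hash_password("correct horse battery staple")
assert check_password("correct horse battery staple", hashed)
assert not check_password("wrong password", hashed)
```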

View File

@@ -5,20 +5,22 @@ services:
   backend:
     environment:
       - AUTH_MODULE=${AUTH_MODULE}
       - VEGAS_CLIENT_SECRET
-      - EVE_SERVER=http://eve:7510
+      - EVE_SERVER=http://eve:${EVE_PORT}
       - REDIS_SERVER=redis
-      - PINE_LOGGING_CONFIG_FILE=/nlp-web-app/shared/logging.python.dev.json
+      - PINE_LOGGING_CONFIG_FILE=${PINE_LOGGING_CONFIG_FILE:-/nlp-web-app/shared/logging.python.dev.json}
   eve:
     build:
       args:
         - DB_DIR=/nlp-web-app/eve/db
+        - MONGO_PORT=${MONGO_PORT}
     volumes:
-      - eve_db:/nlp-web-app/eve/db
+      - ${EVE_DB_VOLUME}:/nlp-web-app/eve/db
     environment:
       - MONGO_URI=
-      - PINE_LOGGING_CONFIG_FILE=/nlp-web-app/shared/logging.python.dev.json
+      - PINE_LOGGING_CONFIG_FILE=${PINE_LOGGING_CONFIG_FILE:-/nlp-web-app/shared/logging.python.dev.json}
   frontend_annotation:
     build:

16
docker-compose.test.yml Normal file
View File

@@ -0,0 +1,16 @@
# (C) 2019 The Johns Hopkins University Applied Physics Laboratory LLC.

version: "3"

services:
  backend:
    ports:
      - "${BACKEND_PORT}:${BACKEND_PORT}"
  eve:
    ports:
      - "${EVE_PORT}:${EVE_PORT}"
      - "${MONGO_PORT}:${MONGO_PORT}"

volumes:
  eve_test_db:

View File

@@ -48,13 +48,14 @@ services:
     volumes:
       - ${SHARED_VOLUME}:/nlp-web-app/shared
       - ${LOGS_VOLUME}:/nlp-web-app/logs
-      - ${DOCUMENT_IMAGE_VOLUME}:/nlp-web-app/document_images
+      - ${DOCUMENT_IMAGE_VOLUME}:/mnt/azure
     environment:
       AL_REDIS_HOST: redis
       AL_REDIS_PORT: ${REDIS_PORT}
       AUTH_MODULE: ${AUTH_MODULE}
       PINE_LOGGING_CONFIG_FILE: /nlp-web-app/shared/logging.python.json
-      DOCUMENT_IMAGE_DIR: /nlp-web-app/document_images
+      DOCUMENT_IMAGE_DIR: /mnt/azure
       PINE_VERSION: ${PINE_VERSION:?Please set PINE_VERSION environment variable.}
       # Expose the following to test:
       # ports:
       #   - ${BACKEND_PORT}:${BACKEND_PORT}

20
docs/Makefile Normal file
View File

@@ -0,0 +1,20 @@
# Minimal makefile for Sphinx documentation
#

# You can set these variables from the command line, and also
# from the environment for the first two.
SPHINXOPTS    ?=
SPHINXBUILD   ?= sphinx-build
SOURCEDIR     = source
BUILDDIR      = build

# Put it first so that "make" without argument is like "make help".
help:
	@$(SPHINXBUILD) -M help "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)

.PHONY: help Makefile

# Catch-all target: route all unknown targets to Sphinx using the new
# "make mode" option. $(O) is meant as a shortcut for $(SPHINXOPTS).
%: Makefile
	@$(SPHINXBUILD) -M $@ "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)

18
docs/Pipfile Normal file
View File

@@ -0,0 +1,18 @@
# (C) 2019 The Johns Hopkins University Applied Physics Laboratory LLC.
[[source]]
url = "https://pypi.org/simple"
verify_ssl = true
name = "pypi"
[packages]
[dev-packages]
sphinx = "*"
sphinx-autoapi = "*"
[scripts]
doc = "make singlehtml html latexpdf LATEXMKOPTS='-silent'"
[requires]
python_version = "3.6"

319
docs/Pipfile.lock generated Normal file
View File

@@ -0,0 +1,319 @@
{
"_meta": {
"hash": {
"sha256": "aab7848f4527a249ac0b2421bb300c9995a4bf089517eabaf28ffd1997fd12a0"
},
"pipfile-spec": 6,
"requires": {
"python_version": "3.6"
},
"sources": [
{
"name": "pypi",
"url": "https://pypi.org/simple",
"verify_ssl": true
}
]
},
"default": {},
"develop": {
"alabaster": {
"hashes": [
"sha256:446438bdcca0e05bd45ea2de1668c1d9b032e1a9154c2c259092d77031ddd359",
"sha256:a661d72d58e6ea8a57f7a86e37d86716863ee5e92788398526d58b26a4e4dc02"
],
"version": "==0.7.12"
},
"astroid": {
"hashes": [
"sha256:2f4078c2a41bf377eea06d71c9d2ba4eb8f6b1af2135bec27bbbb7d8f12bb703",
"sha256:bc58d83eb610252fd8de6363e39d4f1d0619c894b0ed24603b881c02e64c7386"
],
"markers": "python_version >= '3'",
"version": "==2.4.2"
},
"babel": {
"hashes": [
"sha256:1aac2ae2d0d8ea368fa90906567f5c08463d98ade155c0c4bfedd6a0f7160e38",
"sha256:d670ea0b10f8b723672d3a6abeb87b565b244da220d76b4dba1b66269ec152d4"
],
"version": "==2.8.0"
},
"certifi": {
"hashes": [
"sha256:5930595817496dd21bb8dc35dad090f1c2cd0adfaf21204bf6732ca5d8ee34d3",
"sha256:8fc0819f1f30ba15bdb34cceffb9ef04d99f420f68eb75d901e9560b8749fc41"
],
"version": "==2020.6.20"
},
"chardet": {
"hashes": [
"sha256:84ab92ed1c4d4f16916e05906b6b75a6c0fb5db821cc65e70cbd64a3e2a5eaae",
"sha256:fc323ffcaeaed0e0a02bf4d117757b98aed530d9ed4531e3e15460124c106691"
],
"version": "==3.0.4"
},
"docutils": {
"hashes": [
"sha256:0c5b78adfbf7762415433f5515cd5c9e762339e23369dbe8000d84a4bf4ab3af",
"sha256:c2de3a60e9e7d07be26b7f2b00ca0309c207e06c100f9cc2a94931fc75a478fc"
],
"version": "==0.16"
},
"idna": {
"hashes": [
"sha256:b307872f855b18632ce0c21c5e45be78c0ea7ae4c15c828c20788b26921eb3f6",
"sha256:b97d804b1e9b523befed77c48dacec60e6dcb0b5391d57af6a65a312a90648c0"
],
"version": "==2.10"
},
"imagesize": {
"hashes": [
"sha256:6965f19a6a2039c7d48bca7dba2473069ff854c36ae6f19d2cde309d998228a1",
"sha256:b1f6b5a4eab1f73479a50fb79fcf729514a900c341d8503d62a62dbc4127a2b1"
],
"version": "==1.2.0"
},
"jinja2": {
"hashes": [
"sha256:89aab215427ef59c34ad58735269eb58b1a5808103067f7bb9d5836c651b3bb0",
"sha256:f0a4641d3cf955324a89c04f3d94663aa4d638abe8f733ecd3582848e1c37035"
],
"version": "==2.11.2"
},
"lazy-object-proxy": {
"hashes": [
"sha256:0c4b206227a8097f05c4dbdd323c50edf81f15db3b8dc064d08c62d37e1a504d",
"sha256:194d092e6f246b906e8f70884e620e459fc54db3259e60cf69a4d66c3fda3449",
"sha256:1be7e4c9f96948003609aa6c974ae59830a6baecc5376c25c92d7d697e684c08",
"sha256:4677f594e474c91da97f489fea5b7daa17b5517190899cf213697e48d3902f5a",
"sha256:48dab84ebd4831077b150572aec802f303117c8cc5c871e182447281ebf3ac50",
"sha256:5541cada25cd173702dbd99f8e22434105456314462326f06dba3e180f203dfd",
"sha256:59f79fef100b09564bc2df42ea2d8d21a64fdcda64979c0fa3db7bdaabaf6239",
"sha256:8d859b89baf8ef7f8bc6b00aa20316483d67f0b1cbf422f5b4dc56701c8f2ffb",
"sha256:9254f4358b9b541e3441b007a0ea0764b9d056afdeafc1a5569eee1cc6c1b9ea",
"sha256:9651375199045a358eb6741df3e02a651e0330be090b3bc79f6d0de31a80ec3e",
"sha256:97bb5884f6f1cdce0099f86b907aa41c970c3c672ac8b9c8352789e103cf3156",
"sha256:9b15f3f4c0f35727d3a0fba4b770b3c4ebbb1fa907dbcc046a1d2799f3edd142",
"sha256:a2238e9d1bb71a56cd710611a1614d1194dc10a175c1e08d75e1a7bcc250d442",
"sha256:a6ae12d08c0bf9909ce12385803a543bfe99b95fe01e752536a60af2b7797c62",
"sha256:ca0a928a3ddbc5725be2dd1cf895ec0a254798915fb3a36af0964a0a4149e3db",
"sha256:cb2c7c57005a6804ab66f106ceb8482da55f5314b7fcb06551db1edae4ad1531",
"sha256:d74bb8693bf9cf75ac3b47a54d716bbb1a92648d5f781fc799347cfc95952383",
"sha256:d945239a5639b3ff35b70a88c5f2f491913eb94871780ebfabb2568bd58afc5a",
"sha256:eba7011090323c1dadf18b3b689845fd96a61ba0a1dfbd7f24b921398affc357",
"sha256:efa1909120ce98bbb3777e8b6f92237f5d5c8ea6758efea36a473e1d38f7d3e4",
"sha256:f3900e8a5de27447acbf900b4750b0ddfd7ec1ea7fbaf11dfa911141bc522af0"
],
"version": "==1.4.3"
},
"markupsafe": {
"hashes": [
"sha256:00bc623926325b26bb9605ae9eae8a215691f33cae5df11ca5424f06f2d1f473",
"sha256:09027a7803a62ca78792ad89403b1b7a73a01c8cb65909cd876f7fcebd79b161",
"sha256:09c4b7f37d6c648cb13f9230d847adf22f8171b1ccc4d5682398e77f40309235",
"sha256:1027c282dad077d0bae18be6794e6b6b8c91d58ed8a8d89a89d59693b9131db5",
"sha256:13d3144e1e340870b25e7b10b98d779608c02016d5184cfb9927a9f10c689f42",
"sha256:24982cc2533820871eba85ba648cd53d8623687ff11cbb805be4ff7b4c971aff",
"sha256:29872e92839765e546828bb7754a68c418d927cd064fd4708fab9fe9c8bb116b",
"sha256:43a55c2930bbc139570ac2452adf3d70cdbb3cfe5912c71cdce1c2c6bbd9c5d1",
"sha256:46c99d2de99945ec5cb54f23c8cd5689f6d7177305ebff350a58ce5f8de1669e",
"sha256:500d4957e52ddc3351cabf489e79c91c17f6e0899158447047588650b5e69183",
"sha256:535f6fc4d397c1563d08b88e485c3496cf5784e927af890fb3c3aac7f933ec66",
"sha256:596510de112c685489095da617b5bcbbac7dd6384aeebeda4df6025d0256a81b",
"sha256:62fe6c95e3ec8a7fad637b7f3d372c15ec1caa01ab47926cfdf7a75b40e0eac1",
"sha256:6788b695d50a51edb699cb55e35487e430fa21f1ed838122d722e0ff0ac5ba15",
"sha256:6dd73240d2af64df90aa7c4e7481e23825ea70af4b4922f8ede5b9e35f78a3b1",
"sha256:717ba8fe3ae9cc0006d7c451f0bb265ee07739daf76355d06366154ee68d221e",
"sha256:79855e1c5b8da654cf486b830bd42c06e8780cea587384cf6545b7d9ac013a0b",
"sha256:7c1699dfe0cf8ff607dbdcc1e9b9af1755371f92a68f706051cc8c37d447c905",
"sha256:88e5fcfb52ee7b911e8bb6d6aa2fd21fbecc674eadd44118a9cc3863f938e735",
"sha256:8defac2f2ccd6805ebf65f5eeb132adcf2ab57aa11fdf4c0dd5169a004710e7d",
"sha256:98c7086708b163d425c67c7a91bad6e466bb99d797aa64f965e9d25c12111a5e",
"sha256:9add70b36c5666a2ed02b43b335fe19002ee5235efd4b8a89bfcf9005bebac0d",
"sha256:9bf40443012702a1d2070043cb6291650a0841ece432556f784f004937f0f32c",
"sha256:ade5e387d2ad0d7ebf59146cc00c8044acbd863725f887353a10df825fc8ae21",
"sha256:b00c1de48212e4cc9603895652c5c410df699856a2853135b3967591e4beebc2",
"sha256:b1282f8c00509d99fef04d8ba936b156d419be841854fe901d8ae224c59f0be5",
"sha256:b2051432115498d3562c084a49bba65d97cf251f5a331c64a12ee7e04dacc51b",
"sha256:ba59edeaa2fc6114428f1637ffff42da1e311e29382d81b339c1817d37ec93c6",
"sha256:c8716a48d94b06bb3b2524c2b77e055fb313aeb4ea620c8dd03a105574ba704f",
"sha256:cd5df75523866410809ca100dc9681e301e3c27567cf498077e8551b6d20e42f",
"sha256:cdb132fc825c38e1aeec2c8aa9338310d29d337bebbd7baa06889d09a60a1fa2",
"sha256:e249096428b3ae81b08327a63a485ad0878de3fb939049038579ac0ef61e17e7",
"sha256:e8313f01ba26fbbe36c7be1966a7b7424942f670f38e666995b88d012765b9be"
],
"version": "==1.1.1"
},
"packaging": {
"hashes": [
"sha256:4357f74f47b9c12db93624a82154e9b120fa8293699949152b22065d556079f8",
"sha256:998416ba6962ae7fbd6596850b80e17859a5753ba17c32284f67bfff33784181"
],
"version": "==20.4"
},
"pygments": {
"hashes": [
"sha256:647344a061c249a3b74e230c739f434d7ea4d8b1d5f3721bc0f3558049b38f44",
"sha256:ff7a40b4860b727ab48fad6360eb351cc1b33cbf9b15a0f689ca5353e9463324"
],
"version": "==2.6.1"
},
"pyparsing": {
"hashes": [
"sha256:c203ec8783bf771a155b207279b9bccb8dea02d8f0c9e5f8ead507bc3246ecc1",
"sha256:ef9d7589ef3c200abe66653d3f1ab1033c3c419ae9b9bdb1240a85b024efc88b"
],
"version": "==2.4.7"
},
"pytz": {
"hashes": [
"sha256:a494d53b6d39c3c6e44c3bec237336e14305e4f29bbf800b599253057fbb79ed",
"sha256:c35965d010ce31b23eeb663ed3cc8c906275d6be1a34393a1d73a41febf4a048"
],
"version": "==2020.1"
},
"pyyaml": {
"hashes": [
"sha256:06a0d7ba600ce0b2d2fe2e78453a470b5a6e000a985dd4a4e54e436cc36b0e97",
"sha256:240097ff019d7c70a4922b6869d8a86407758333f02203e0fc6ff79c5dcede76",
"sha256:4f4b913ca1a7319b33cfb1369e91e50354d6f07a135f3b901aca02aa95940bd2",
"sha256:69f00dca373f240f842b2931fb2c7e14ddbacd1397d57157a9b005a6a9942648",
"sha256:73f099454b799e05e5ab51423c7bcf361c58d3206fa7b0d555426b1f4d9a3eaf",
"sha256:74809a57b329d6cc0fdccee6318f44b9b8649961fa73144a98735b0aaf029f1f",
"sha256:7739fc0fa8205b3ee8808aea45e968bc90082c10aef6ea95e855e10abf4a37b2",
"sha256:95f71d2af0ff4227885f7a6605c37fd53d3a106fcab511b8860ecca9fcf400ee",
"sha256:b8eac752c5e14d3eca0e6dd9199cd627518cb5ec06add0de9d32baeee6fe645d",
"sha256:cc8955cfbfc7a115fa81d85284ee61147059a753344bc51098f3ccd69b0d7e0c",
"sha256:d13155f591e6fcc1ec3b30685d50bf0711574e2c0dfffd7644babf8b5102ca1a"
],
"version": "==5.3.1"
},
"requests": {
"hashes": [
"sha256:b3559a131db72c33ee969480840fff4bb6dd111de7dd27c8ee1f820f4f00231b",
"sha256:fe75cc94a9443b9246fc7049224f75604b113c36acb93f87b80ed42c44cbb898"
],
"version": "==2.24.0"
},
"six": {
"hashes": [
"sha256:30639c035cdb23534cd4aa2dd52c3bf48f06e5f4a941509c8bafd8ce11080259",
"sha256:8b74bedcbbbaca38ff6d7491d76f2b06b3592611af620f8426e82dddb04a5ced"
],
"version": "==1.15.0"
},
"snowballstemmer": {
"hashes": [
"sha256:209f257d7533fdb3cb73bdbd24f436239ca3b2fa67d56f6ff88e86be08cc5ef0",
"sha256:df3bac3df4c2c01363f3dd2cfa78cce2840a79b9f1c2d2de9ce8d31683992f52"
],
"version": "==2.0.0"
},
"sphinx": {
"hashes": [
"sha256:97dbf2e31fc5684bb805104b8ad34434ed70e6c588f6896991b2fdfd2bef8c00",
"sha256:b9daeb9b39aa1ffefc2809b43604109825300300b987a24f45976c001ba1a8fd"
],
"index": "pypi",
"version": "==3.1.2"
},
"sphinx-autoapi": {
"hashes": [
"sha256:eb86024fb04f6f1c61d8be73f56db40bf730a932cf0c8d0456a43bae4c11b508",
"sha256:f76ef71d443c6a9ad5e1b326d4dfc196e2080e8b46141c45d1bb47a73a34f190"
],
"index": "pypi",
"version": "==1.4.0"
},
"sphinxcontrib-applehelp": {
"hashes": [
"sha256:806111e5e962be97c29ec4c1e7fe277bfd19e9652fb1a4392105b43e01af885a",
"sha256:a072735ec80e7675e3f432fcae8610ecf509c5f1869d17e2eecff44389cdbc58"
],
"version": "==1.0.2"
},
"sphinxcontrib-devhelp": {
"hashes": [
"sha256:8165223f9a335cc1af7ffe1ed31d2871f325254c0423bc0c4c7cd1c1e4734a2e",
"sha256:ff7f1afa7b9642e7060379360a67e9c41e8f3121f2ce9164266f61b9f4b338e4"
],
"version": "==1.0.2"
},
"sphinxcontrib-htmlhelp": {
"hashes": [
"sha256:3c0bc24a2c41e340ac37c85ced6dafc879ab485c095b1d65d2461ac2f7cca86f",
"sha256:e8f5bb7e31b2dbb25b9cc435c8ab7a79787ebf7f906155729338f3156d93659b"
],
"version": "==1.0.3"
},
"sphinxcontrib-jsmath": {
"hashes": [
"sha256:2ec2eaebfb78f3f2078e73666b1415417a116cc848b72e5172e596c871103178",
"sha256:a9925e4a4587247ed2191a22df5f6970656cb8ca2bd6284309578f2153e0c4b8"
],
"version": "==1.0.1"
},
"sphinxcontrib-qthelp": {
"hashes": [
"sha256:4c33767ee058b70dba89a6fc5c1892c0d57a54be67ddd3e7875a18d14cba5a72",
"sha256:bd9fc24bcb748a8d51fd4ecaade681350aa63009a347a8c14e637895444dfab6"
],
"version": "==1.0.3"
},
"sphinxcontrib-serializinghtml": {
"hashes": [
"sha256:eaa0eccc86e982a9b939b2b82d12cc5d013385ba5eadcc7e4fed23f4405f77bc",
"sha256:f242a81d423f59617a8e5cf16f5d4d74e28ee9a66f9e5b637a18082991db5a9a"
],
"version": "==1.1.4"
},
"typed-ast": {
"hashes": [
"sha256:0666aa36131496aed8f7be0410ff974562ab7eeac11ef351def9ea6fa28f6355",
"sha256:0c2c07682d61a629b68433afb159376e24e5b2fd4641d35424e462169c0a7919",
"sha256:249862707802d40f7f29f6e1aad8d84b5aa9e44552d2cc17384b209f091276aa",
"sha256:24995c843eb0ad11a4527b026b4dde3da70e1f2d8806c99b7b4a7cf491612652",
"sha256:269151951236b0f9a6f04015a9004084a5ab0d5f19b57de779f908621e7d8b75",
"sha256:4083861b0aa07990b619bd7ddc365eb7fa4b817e99cf5f8d9cf21a42780f6e01",
"sha256:498b0f36cc7054c1fead3d7fc59d2150f4d5c6c56ba7fb150c013fbc683a8d2d",
"sha256:4e3e5da80ccbebfff202a67bf900d081906c358ccc3d5e3c8aea42fdfdfd51c1",
"sha256:6daac9731f172c2a22ade6ed0c00197ee7cc1221aa84cfdf9c31defeb059a907",
"sha256:715ff2f2df46121071622063fc7543d9b1fd19ebfc4f5c8895af64a77a8c852c",
"sha256:73d785a950fc82dd2a25897d525d003f6378d1cb23ab305578394694202a58c3",
"sha256:8c8aaad94455178e3187ab22c8b01a3837f8ee50e09cf31f1ba129eb293ec30b",
"sha256:8ce678dbaf790dbdb3eba24056d5364fb45944f33553dd5869b7580cdbb83614",
"sha256:aaee9905aee35ba5905cfb3c62f3e83b3bec7b39413f0a7f19be4e547ea01ebb",
"sha256:bcd3b13b56ea479b3650b82cabd6b5343a625b0ced5429e4ccad28a8973f301b",
"sha256:c9e348e02e4d2b4a8b2eedb48210430658df6951fa484e59de33ff773fbd4b41",
"sha256:d205b1b46085271b4e15f670058ce182bd1199e56b317bf2ec004b6a44f911f6",
"sha256:d43943ef777f9a1c42bf4e552ba23ac77a6351de620aa9acf64ad54933ad4d34",
"sha256:d5d33e9e7af3b34a40dc05f498939f0ebf187f07c385fd58d591c533ad8562fe",
"sha256:fc0fea399acb12edbf8a628ba8d2312f583bdbdb3335635db062fa98cf71fca4",
"sha256:fe460b922ec15dd205595c9b5b99e2f056fd98ae8f9f56b888e7a17dc2b757e7"
],
"markers": "implementation_name == 'cpython' and python_version < '3.8'",
"version": "==1.4.1"
},
"unidecode": {
"hashes": [
"sha256:1d7a042116536098d05d599ef2b8616759f02985c85b4fef50c78a5aaf10822a",
"sha256:2b6aab710c2a1647e928e36d69c21e76b453cd455f4e2621000e54b2a9b8cce8"
],
"version": "==1.1.1"
},
"urllib3": {
"hashes": [
"sha256:3018294ebefce6572a474f0604c2021e33b3fd8006ecd11d62107a5d2a963527",
"sha256:88206b0eb87e6d677d424843ac5209e3fb9d0190d0ee169599165ec25e9d9115"
],
"version": "==1.25.9"
},
"wrapt": {
"hashes": [
"sha256:b62ffa81fb85f4332a4f609cab4ac40709470da05643a082ec1eb88e6d9b97d7"
],
"version": "==1.12.1"
}
}
}

20
docs/README.md Normal file
View File

@@ -0,0 +1,20 @@
&copy; 2019 The Johns Hopkins University Applied Physics Laboratory LLC.
## Developer Environment
Required packages:
* python 3.6
* pipenv
* make
Install the dependencies with `pipenv install --dev`.
### Generating documentation
Make sure you have the proper latex packages installed:
* `sudo apt install latexmk texlive-latex-recommended texlive-fonts-recommended texlive-latex-extra`
Generate through pipenv:
* `pipenv run doc`, or use the convenience script `../generate_documentation.sh`
Generated documentation can then be found in `./build`.

Binary file not shown.

Some files were not shown because too many files have changed in this diff.