mirror of
https://github.com/JHUAPL/AccumuloGraph.git
synced 2026-01-09 20:57:55 -05:00
Lots of updates to README.md
145
README.md
@@ -4,6 +4,27 @@ AccumuloGraph
|
||||
|
||||
This is an implementation of the [TinkerPop Blueprints](http://tinkerpop.com)
2.6 API using [Apache Accumulo](http://accumulo.apache.org) as the backend.
This combines the benefits and flexibility of Blueprints
with the scalability and performance of Accumulo.
|
||||
|
||||
Beyond the basic Blueprints functionality, we provide additional
features that harness more of Accumulo's power.
|
||||
|
||||
Some features include...
|
||||
|
||||
|
||||
Benchmarks
|
||||
|
||||
|
||||
Indexing via the `IndexableGraph` and `KeyIndexableGraph` interfaces.
|
||||
|
||||
Benchmarking
|
||||
|
||||
Feel free to email with suggestions for improvements.
Please submit issues for any bugs you find or features you want.
We are also open to pull requests.
|
||||
|
||||
|
||||
This implementation provides easy to use, easy to write, and easy to read
access to an arbitrarily large graph that is stored in Accumulo.
|
||||
|
||||
@@ -12,12 +33,15 @@ We implement the following Blueprints interfaces:
|
||||
<br>2. KeyIndexableGraph
|
||||
<br>3. IndexableGraph
|
||||
|
||||
Please feel free to submit issues for any bugs you find or features you want.
|
||||
We are open to pull requests from your forks also.
|
||||
Benchmarking.
|
||||
|
||||
## Usage
|
||||
|
||||
The releases are currently stored in Maven Central.
|
||||
|
||||
## Getting Started
|
||||
|
||||
First, include AccumuloGraph as a Maven dependency. Releases are deployed
to Maven Central.
|
||||
|
||||
```xml
|
||||
<dependency>
|
||||
<groupId>edu.jhuapl.tinkerpop</groupId>
|
||||
@@ -26,8 +50,119 @@ The releases are currently stored in Maven Central.
|
||||
</dependency>
|
||||
```
|
||||
|
||||
For non-Maven users, the binaries can be found in the releases section in this
|
||||
For non-Maven users, the binary jars can be found in the releases section in this
|
||||
GitHub repository, or you can get them from Maven Central.
|
||||
|
||||
Creating an `AccumuloGraph` involves setting a few parameters in an
`AccumuloGraphConfiguration` object, and opening the graph.
The defaults are sensible for using an Accumulo cluster.
We provide some simple examples below. Javadocs for
`AccumuloGraphConfiguration` explain all the other parameters
in more detail.
|
||||
|
||||
First, to instantiate an in-memory graph:
|
||||
```java
Configuration cfg = new AccumuloGraphConfiguration()
  .setInstanceType(InstanceType.Mock)
  .setGraphName("graph");
return GraphFactory.open(cfg);
```
|
||||
|
||||
This creates a "Mock" instance which holds the graph in memory.
You can now use all the Blueprints and AccumuloGraph-specific functionality
with this in-memory graph. This is useful for getting familiar
with AccumuloGraph's functionality, or for testing or prototyping
purposes.
|
||||
|
||||
To use an actual Accumulo cluster, use the following:
|
||||
```java
Configuration cfg = new AccumuloGraphConfiguration()
  .setInstanceType(InstanceType.Distributed)
  .setZooKeeperHosts("zookeeper-host")
  .setInstanceName("instance-name")
  .setUser("user").setPassword("password")
  .setGraphName("graph")
  .setCreate(true);
return GraphFactory.open(cfg);
```
|
||||
|
||||
This directs AccumuloGraph to use a "Distributed" Accumulo
instance, and sets the appropriate ZooKeeper parameters,
instance name, and authentication information, which correspond
to the usual Accumulo connection settings. The graph name is
used to create several backing tables in Accumulo, and the
`setCreate` option tells AccumuloGraph to create the backing
tables if they don't already exist.
|
||||
|
||||
|
||||
## Improving Performance
|
||||
|
||||
This section describes various configuration parameters that
greatly enhance AccumuloGraph's performance. Brief descriptions
of each option are provided here, but refer to the
`AccumuloGraphConfiguration` Javadoc for fuller explanations.
|
||||
|
||||
### Disable consistency checks
|
||||
|
||||
The Blueprints API specifies a number of consistency checks for
various operations, and requires that errors be thrown if they fail. Some examples
of invalid operations include adding a vertex with the same id as an
existing vertex, adding edges between nonexistent vertices,
and setting properties on nonexistent elements.
Unfortunately, checking the above constraints for an
Accumulo installation carries a significant performance cost,
since the checks require extra traffic to Accumulo using inefficient
non-batched access patterns.
|
||||
|
||||
To remedy these performance issues, AccumuloGraph exposes
several options to disable some of the above checks.
These include:
* `setAutoFlush` - to disable automatically flushing
  changes to the backing Accumulo tables
* `setSkipExistenceChecks` - to disable element
  existence checks, avoiding trips to the Accumulo cluster
* `setIndexableGraphDisabled` - to disable
  indexing functionality, which improves performance
  of element removal
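
To illustrate why existence checks are costly, here is a minimal, self-contained sketch. It is a toy model and not AccumuloGraph's implementation: the class `ExistenceCheckSketch`, its `skipExistenceChecks` flag, and the round-trip counter are all hypothetical names introduced for illustration. The sketch assumes each existence check costs one non-batched lookup against the cluster, and simply counts those lookups.

```java
import java.util.HashSet;
import java.util.Set;

/** Toy model of a remote vertex store (hypothetical; not AccumuloGraph code). */
public class ExistenceCheckSketch {
    // Stands in for the backing Accumulo table.
    private final Set<String> remote = new HashSet<>();
    private final boolean skipExistenceChecks;
    // Number of simulated non-batched lookups against the cluster.
    private int roundTrips = 0;

    public ExistenceCheckSketch(boolean skipExistenceChecks) {
        this.skipExistenceChecks = skipExistenceChecks;
    }

    /** Simulated remote read: every call counts as one round trip. */
    private boolean existsRemotely(String id) {
        roundTrips++;
        return remote.contains(id);
    }

    /** Adds a vertex, optionally verifying first that the id is unused. */
    public void addVertex(String id) {
        if (!skipExistenceChecks && existsRemotely(id)) {
            throw new IllegalArgumentException("Vertex already exists: " + id);
        }
        // Writes could be batched by a real client, so they are not counted here.
        remote.add(id);
    }

    public int getRoundTrips() {
        return roundTrips;
    }

    public static void main(String[] args) {
        ExistenceCheckSketch checked = new ExistenceCheckSketch(false);
        ExistenceCheckSketch unchecked = new ExistenceCheckSketch(true);
        for (int i = 0; i < 1000; i++) {
            checked.addVertex("v" + i);
            unchecked.addVertex("v" + i);
        }
        System.out.println("round trips with checks:    " + checked.getRoundTrips());
        System.out.println("round trips without checks: " + unchecked.getRoundTrips());
    }
}
```

In this model, the checked store performs one lookup per added vertex while the unchecked store performs none. The trade-off is that the unchecked store silently accepts duplicate ids, so skipping checks is only appropriate when the caller already guarantees valid input.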
|
||||
|
||||
### Set Accumulo performance parameters
|
||||
|
||||
Accumulo itself features a number of performance-related parameters,
and we allow configuration of these. Generally, these relate to
write buffer sizes, multithreading, etc. The settings include:
* `setMaxWriteLatency` - max time prior to flushing
  element write buffer
* `setMaxWriteMemory` - max size for element write buffer
* `setMaxWriteThreads` - max threads used for element writing
* `setMaxWriteTimeout` - max time to wait before failing
  element buffer writes
* `setQueryThreads` - number of query threads to use
  for fetching elements, properties etc.
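
The interaction between the write buffer's size and latency limits can be sketched with a small toy buffer. This is not Accumulo's writer and not AccumuloGraph code: `WriteBufferSketch` and its fields are hypothetical, sizes are approximated by character count, and the latency limit is checked only when a write arrives, whereas a real client would likely use a background timer. Timestamps are passed in explicitly so the behavior is deterministic.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy write buffer with size and latency flush triggers (hypothetical sketch). */
public class WriteBufferSketch {
    private final long maxMemory;        // plays the role of a max write memory setting
    private final long maxLatencyMillis; // plays the role of a max write latency setting
    private final List<String> pending = new ArrayList<>();
    private long pendingSize = 0;
    private long oldestPendingAt = -1;   // arrival time of the oldest unflushed write
    private int flushes = 0;

    public WriteBufferSketch(long maxMemory, long maxLatencyMillis) {
        this.maxMemory = maxMemory;
        this.maxLatencyMillis = maxLatencyMillis;
    }

    /** Buffers one write and flushes if either limit has been reached. */
    public void write(String mutation, long nowMillis) {
        if (pending.isEmpty()) {
            oldestPendingAt = nowMillis;
        }
        pending.add(mutation);
        pendingSize += mutation.length(); // character count as a stand-in for bytes
        boolean tooBig = pendingSize >= maxMemory;
        boolean tooOld = nowMillis - oldestPendingAt >= maxLatencyMillis;
        if (tooBig || tooOld) {
            flush();
        }
    }

    /** Sends all pending writes as one batch (simulated by clearing the buffer). */
    public void flush() {
        if (pending.isEmpty()) {
            return;
        }
        pending.clear();
        pendingSize = 0;
        oldestPendingAt = -1;
        flushes++;
    }

    public int getFlushes() {
        return flushes;
    }

    public int getPendingCount() {
        return pending.size();
    }
}
```

Under these assumptions, a larger memory limit means fewer, bigger batches, while a shorter latency limit bounds how long a write can sit unflushed at the cost of more frequent flushes.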
|
||||
|
||||
### Caching and preloading
|
||||
|
||||
AccumuloGraph contains a number of caching and preloading options:

* `setPropertyCacheTimeout`

* `setEdgeCacheParams`
* `setVertexCacheParams`

* `setPreloadedEdgeLabels`
* `setPreloadedProperties`
|
||||
|
||||
|
||||
## Bulk Ingest
|
||||
|
||||
|
||||
|
||||
## Hadoop Integration
|
||||
|
||||
|
||||
## Table Structure
|
||||
|
||||
|
||||
|
||||
## Code Examples
|
||||
### Creating a new or connecting to an existing distributed graph
|
||||
```java
|
||||
|
||||