Merge branch 'master' into table-wrappers

Michael Lieberman
2014-12-29 15:52:12 -05:00
11 changed files with 568 additions and 182 deletions

292
README.md

@@ -4,140 +4,210 @@ AccumuloGraph
This is an implementation of the [TinkerPop Blueprints](http://tinkerpop.com)
2.6 API using [Apache Accumulo](http://accumulo.apache.org) as the backend.
This implementation provides easy to use, easy to write, and easy to read
access to an arbitrarily large graph that is stored in Accumulo.
We implement the following Blueprints interfaces:
<br>1. Graph
<br>2. KeyIndexableGraph
<br>3. IndexableGraph
Please feel free to submit issues for any bugs you find or features you want.
We are open to pull requests from your forks also.
This combines the many benefits and flexibility of Blueprints
with the scalability and performance of Accumulo.
##Usage
In addition to the basic Blueprints functionality, we provide a number
of enhanced features, including:
* Indexing implementations via `IndexableGraph` and `KeyIndexableGraph`
* Support for mock, mini, and distributed instances of Accumulo
* Numerous performance tweaks and configuration parameters
* Support for high speed ingest
* Hadoop integration
Feel free to contact us with bugs, suggestions, pull requests,
or simply how you are leveraging AccumuloGraph in your own work.
## Getting Started
First, include AccumuloGraph as a Maven dependency. Releases are deployed
to Maven Central.
The releases are currently stored in Maven Central.
```xml
<dependency>
<groupId>edu.jhuapl.tinkerpop</groupId>
<artifactId>blueprints-accumulo-graph</artifactId>
<version>0.0.2</version>
<version>0.0.3-SNAPSHOT</version>
</dependency>
```
For non-Maven users, the binaries can be found in the releases section in this
For non-Maven users, the binary jars can be found in the releases section in this
GitHub repository, or you can get them from Maven Central.
##Code Examples
###Creating a new or connecting to an existing distributed graph
Creating an `AccumuloGraph` involves setting a few parameters in an
`AccumuloGraphConfiguration` object, and opening the graph.
The defaults are sensible for using an Accumulo cluster.
We provide some simple examples below. Javadocs for
`AccumuloGraphConfiguration` explain all the other parameters
in more detail.
First, to instantiate an in-memory graph:
```java
Configuration cfg = new AccumuloGraphConfiguration()
.setInstanceName("accumulo").setUser("user").setZookeeperHosts("zk1")
.setPassword("password".getBytes()).setGraphName("myGraph");
Graph graph = GraphFactory.open(cfg.getConfiguration());
.setInstanceType(InstanceType.Mock)
.setGraphName("graph");
return GraphFactory.open(cfg);
```
###Creating a new Mock Graph
Setting the instance type to mock allows for in-memory processing with a MockAccumulo instance.<br>
There is also support for Mini Accumulo.
This creates a "Mock" instance which holds the graph in memory.
You can now use all the Blueprints and AccumuloGraph-specific functionality
with this in-memory graph. This is useful for getting familiar
with AccumuloGraph's functionality, or for testing or prototyping
purposes.
To use an actual Accumulo cluster, use the following:
```java
Configuration cfg = new AccumuloGraphConfiguration().setInstanceType(InstanceType.Mock)
.setGraphName("myGraph");
Graph graph = GraphFactory.open(cfg);
Configuration cfg = new AccumuloGraphConfiguration()
.setInstanceType(InstanceType.Distributed)
.setZooKeeperHosts("zookeeper-host")
.setInstanceName("instance-name")
.setUser("user").setPassword("password")
.setGraphName("graph")
.setCreate(true);
return GraphFactory.open(cfg);
```
###Accessing a graph
This directs AccumuloGraph to use a "Distributed" Accumulo
instance, and sets the appropriate ZooKeeper parameters,
instance name, and authentication information, which correspond
to the usual Accumulo connection settings. The graph name is
used to create several backing tables in Accumulo, and the
`setCreate` option tells AccumuloGraph to create the backing
tables if they don't already exist.
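
Because the graph name is used to name the backing tables, it can help to validate it before opening the graph. The sketch below mirrors the alphanumeric-and-underscore rule that `setGraphName` enforces in this change. The `GraphNames` class and its `isValid` method are hypothetical helpers for illustration, not part of AccumuloGraph.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of AccumuloGraph: checks a candidate graph
// name against the alphanumeric-and-underscore rule before it is used.
final class GraphNames {
  private static final Pattern VALID = Pattern.compile("^[A-Za-z0-9_]+$");

  private GraphNames() {}

  // Returns true only for non-null names made of letters, digits, underscores.
  static boolean isValid(String name) {
    return name != null && VALID.matcher(name).matches();
  }

  public static void main(String[] args) {
    String[] candidates = {"alpha", "under_score1", "hyph-en", "dot..s"};
    for (String name : candidates) {
      System.out.println(name + " -> " + isValid(name));
    }
  }
}
```

Checking up front lets a caller report a bad name with its own message instead of catching the `IllegalArgumentException` thrown by the setter.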
AccumuloGraph also has limited support for a "Mini" instance
of Accumulo.
## Improving Performance
This section describes various configuration parameters that
greatly enhance AccumuloGraph's performance. Brief descriptions
of each option are provided here, but refer to the
`AccumuloGraphConfiguration` Javadoc for fuller explanations.
### Disable consistency checks
The Blueprints API specifies a number of consistency checks for
various operations, and requires errors if they fail. Some examples
of invalid operations include adding a vertex with the same id as an
existing vertex, adding edges between nonexistent vertices,
and setting properties on nonexistent elements.
Unfortunately, checking the above constraints for an
Accumulo installation entails significant performance issues,
since these require extra traffic to Accumulo using inefficient
non-batched access patterns.
To remedy these performance issues, AccumuloGraph exposes
several options to disable some of the checks above.
These include:
* `setAutoFlush` - to disable automatically flushing
changes to the backing Accumulo tables
* `setSkipExistenceChecks` - to disable element
existence checks, avoiding trips to the Accumulo cluster
* `setIndexableGraphDisabled` - to disable
indexing functionality, which improves performance
of element removal
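
To make the trade-off concrete, here is a toy model that assumes each existence check costs one round trip to the backing store. `ToyVertexStore` and its counter are hypothetical and do not reflect AccumuloGraph's internals. The point is only that skipping the check removes one lookup per insert, at the price of silently accepting duplicates.

```java
import java.util.HashSet;
import java.util.Set;

// Toy model (hypothetical, not AccumuloGraph code): counts simulated
// round trips so the cost of per-insert existence checks is visible.
final class ToyVertexStore {
  private final Set<String> ids = new HashSet<>();
  private final boolean skipExistenceChecks;
  private int roundTrips = 0;

  ToyVertexStore(boolean skipExistenceChecks) {
    this.skipExistenceChecks = skipExistenceChecks;
  }

  // Adds a vertex id; with checks enabled, a duplicate id is rejected.
  void addVertex(String id) {
    if (!skipExistenceChecks) {
      roundTrips++; // simulated read to see whether the id already exists
      if (ids.contains(id)) {
        throw new IllegalArgumentException("Vertex already exists: " + id);
      }
    }
    ids.add(id); // the write is assumed to be batched, so it is not counted
  }

  int roundTrips() {
    return roundTrips;
  }
}
```

In this model, inserting N vertices with checks enabled costs N simulated reads, while the unchecked store performs none.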
### Tweak Accumulo performance parameters
Accumulo itself features a number of performance-related parameters,
and we allow configuration of these. Generally, these relate to
write buffer sizes, multithreading, etc. The settings include:
* `setMaxWriteLatency` - max time prior to flushing
element write buffer
* `setMaxWriteMemory` - max size for element write buffer
* `setMaxWriteThreads` - max threads used for element writing
* `setMaxWriteTimeout` - max time to wait before failing
element buffer writes
* `setQueryThreads` - number of query threads to use
for fetching elements, properties etc.
### Enable edge and property preloading
As a performance tweak, AccumuloGraph performs lazy loading of
properties and edges. This means that an operation such as
`getVertex` does not by default populate the returned
vertex object with the associated vertex's properties
and edges. Instead, they are initialized only when requested via
`getProperty`, `getEdges`, etc. Lazy loading is useful
when you won't be accessing many of these properties or
edges. However, if certain properties or edges will
be accessed frequently, you can set options for preloading
these specific properties and edges, which will be more
efficient than on-the-fly loading. These options include:
* `setPreloadedProperties` - set property keys
to be preloaded
* `setPreloadedEdgeLabels` - set edges to be
preloaded based on their labels
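
The difference between lazy loading and preloading can be sketched with a small stand-in class. `LazyVertex`, its `fetcher` function, and the fetch counter are all assumptions made for illustration and are not AccumuloGraph classes. The sketch also assumes that preloaded keys arrive in a single bulk fetch.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;

// Hypothetical sketch, not AccumuloGraph code: a vertex that fetches
// properties lazily unless their keys were marked for preloading.
final class LazyVertex {
  private final Map<String, Object> loaded = new HashMap<>();
  private final Function<String, Object> fetcher; // stands in for a store lookup
  private int fetches = 0;

  LazyVertex(Set<String> preloadedKeys, Function<String, Object> fetcher) {
    this.fetcher = fetcher;
    // Preloaded keys are fetched up front and counted as one bulk fetch.
    if (!preloadedKeys.isEmpty()) {
      fetches++;
      for (String key : preloadedKeys) {
        loaded.put(key, fetcher.apply(key));
      }
    }
  }

  // Returns the property, fetching it on first access if it was not preloaded.
  Object getProperty(String key) {
    if (!loaded.containsKey(key)) {
      fetches++;
      loaded.put(key, fetcher.apply(key));
    }
    return loaded.get(key);
  }

  int fetches() {
    return fetches;
  }
}
```

With no preloaded keys, each distinct property costs one fetch on first access. With frequently used keys preloaded, later `getProperty` calls for those keys cost nothing extra.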
### Enable caching
AccumuloGraph contains a number of caching options
that mitigate the need for Accumulo traffic for recently-accessed
elements. The following options control caching:
* `setVertexCacheParams` - size and expiry for vertex cache
* `setEdgeCacheParams` - size and expiry for edge cache
* `setPropertyCacheTimeout` - property expiry time,
which can be specified globally and/or for individual properties
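
As an illustration of what "size and expiry" mean for a cache, the following sketch combines an LRU bound with a per-entry timeout. It is not AccumuloGraph's cache implementation. `ExpiringLruCache` is a hypothetical class, and the clock is passed in explicitly (an assumption made here so the behavior is deterministic).

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch, not AccumuloGraph's cache: an LRU map with a maximum
// size and a per-entry expiry, the two knobs the caching options describe.
final class ExpiringLruCache<K, V> {
  // A cached value together with the time at which it stops being valid.
  private static final class Timed<T> {
    final T value;
    final long expiresAtMillis;

    Timed(T value, long expiresAtMillis) {
      this.value = value;
      this.expiresAtMillis = expiresAtMillis;
    }
  }

  private final long timeoutMillis;
  private final Map<K, Timed<V>> map;

  ExpiringLruCache(final int maxSize, long timeoutMillis) {
    this.timeoutMillis = timeoutMillis;
    // accessOrder=true keeps the least recently used entry first.
    this.map = new LinkedHashMap<K, Timed<V>>(16, 0.75f, true) {
      @Override
      protected boolean removeEldestEntry(Map.Entry<K, Timed<V>> eldest) {
        return size() > maxSize;
      }
    };
  }

  void cache(K key, V value, long nowMillis) {
    map.put(key, new Timed<V>(value, nowMillis + timeoutMillis));
  }

  // Returns the cached value, or null if absent or expired at nowMillis.
  V retrieve(K key, long nowMillis) {
    Timed<V> entry = map.get(key);
    if (entry == null) {
      return null;
    }
    if (nowMillis >= entry.expiresAtMillis) {
      map.remove(key);
      return null;
    }
    return entry.value;
  }
}
```

A larger size keeps more elements away from the backing store, while a shorter expiry limits how stale a cached element can become.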
## High Speed Ingest
One of Accumulo's key advantages is its ability to ingest huge
amounts of data at high speed. To leverage this ability, we provide
an additional `AccumuloBulkIngester` class that
trades consistency guarantees for ingest speed.
The following is an example of how to use the bulk ingester to
ingest a simple graph:
```java
Vertex v1 = graph.addVertex("1");
v1.setProperty("name", "Alice");
Vertex v2 = graph.addVertex("2");
v2.setProperty("name", "Bob");
Edge e1 = graph.addEdge("E1", v1, v2, "knows");
e1.setProperty("since", new Date());
```
###Creating indexes
```java
((KeyIndexableGraph)graph)
.createKeyIndex("name", Vertex.class);
AccumuloGraphConfiguration cfg = ...;
AccumuloBulkIngester ingester = new AccumuloBulkIngester(cfg);
// Add a vertex.
ingester.addVertex("A").finish();
// Add another vertex with properties.
ingester.addVertex("B")
.add("P1", "V1").add("P2", "V2")
.finish();
// Add an edge.
ingester.addEdge("A", "B", "edge").finish();
// Shutdown and compact tables.
ingester.shutdown(true);
```
###MapReduce Integration
####In the tool
See the Javadocs for more details.
Note that you are responsible for ensuring that data is entered
in a consistent way, or the resulting graph will
have undefined behavior.
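
One way to guard against inconsistent input is to check the data before handing it to the ingester. The sketch below flags edges whose endpoints are not in the set of vertex ids about to be ingested. `IngestPlanCheck` is a hypothetical helper, not part of AccumuloGraph, and it assumes the vertex ids and edges fit in memory.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical pre-ingest check, not part of AccumuloGraph: finds edges whose
// endpoints are missing from the set of vertex ids about to be ingested.
final class IngestPlanCheck {
  private IngestPlanCheck() {}

  // Each edge is a two-element array: {outVertexId, inVertexId}.
  static List<String[]> danglingEdges(Set<String> vertexIds, List<String[]> edges) {
    List<String[]> dangling = new ArrayList<>();
    for (String[] edge : edges) {
      if (!vertexIds.contains(edge[0]) || !vertexIds.contains(edge[1])) {
        dangling.add(edge);
      }
    }
    return dangling;
  }
}
```

An empty result means every edge refers to vertices that will exist after ingest. It says nothing about duplicate ids or about data already in the graph.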
## Hadoop Integration
AccumuloGraph features Hadoop integration via custom input and output
format implementations. `VertexInputFormat` and `EdgeInputFormat`
allow vertex and edge inputs to mappers, respectively. Use as follows:
```java
AccumuloConfiguration cfg = new AccumuloGraphConfiguration()
.setInstanceName("accumulo").setZookeeperHosts("zk1").setUser("root")
.setPassword("secret".getBytes()).setGraphName("myGraph");
AccumuloGraphConfiguration cfg = ...;
// For vertices:
Job j = new Job();
j.setInputFormatClass(VertexInputFormat.class);
VertexInputFormat.setAccumuloGraphConfiguration(j,
cfg.getConfiguration());
VertexInputFormat.setAccumuloGraphConfiguration(j, cfg);
// For edges:
Job j = new Job();
j.setInputFormatClass(EdgeInputFormat.class);
EdgeInputFormat.setAccumuloGraphConfiguration(j, cfg);
```
####In the mapper
`ElementOutputFormat` allows writing to an AccumuloGraph from
reducers. Use as follows:
```java
public void map(Text k, Vertex v, Context c) {
System.out.println(v.getId().toString());
}
```
##Table Design
###Vertex Table
Row ID | Column Family | Column Qualifier | Value
---|---|---|---
VertexID | Label Flag | Exists Flag | [empty]
VertexID | INVERTEX | OutVertexID_EdgeID | Edge Label
VertexID | OUTVERTEX | InVertexID_EdgeID | Edge Label
VertexID | Property Key | [empty] | Serialized Value
###Edge Table
Row ID | Column Family | Column Qualifier | Value
---|---|---|---
EdgeID|Label Flag|InVertexID_OutVertexID|Edge Label
EdgeID|Property Key|[empty]|Serialized Value
###Edge/Vertex Index
Row ID | Column Family | Column Qualifier | Value
---|---|---|---
Serialized Value|Property Key|VertexID/EdgeID|[empty]
###Metadata Table
Row ID | Column Family | Column Qualifier | Value
---|---|---|---
Index Name| Index Class |[empty]|[empty]
##Advanced Configuration
###Graph Configuration
- setGraphName(String name)
- setCreate(boolean create) - Sets whether the backing graph tables should be created if they do not exist.
- setClear(boolean clear) - Sets whether the backing graph tables should be reset if they exist.
- autoFlush(boolean autoFlush) - Sets whether each graph element and property change will be flushed to the server.
- skipExistenceChecks(boolean skip) - Sets whether to skip existence checks when creating graph elements.
- setAutoIndex(boolean ison) - Turns automatic indexing on or off.
###Accumulo Control
- setUser(String user) - Sets the user to use when connecting to Accumulo
- setPassword(byte[] password | String password) - Sets the password to use when connecting to Accumulo
- setZookeeperHosts(String zookeeperHosts) - Sets the Zookeepers to connect to.
- setInstanceName(String instance) - Sets the Instance name to use when connecting to Zookeeper
- setInstanceType(InstanceType type) - Sets the type of Instance to use: Distributed, Mini, or Mock. Defaults to Distributed
- setQueryThreads(int threads) - Specifies the number of threads to use in scanners. Defaults to 3
- setMaxWriteLatency(long latency) - Sets the latency to be used for all writes to Accumulo
- setMaxWriteTimeout(long timeout) - Sets the timeout to be used for all writes to Accumulo
- setMaxWriteMemory(long mem) - Sets the memory buffer to be used for all writes to Accumulo
- setMaxWriteThreads(int threads) - Sets the number of threads to be used for all writes to Accumulo
- setAuthorizations(Authorizations auths) - Sets the authorizations to use when accessing the graph
- setColumnVisibility(ColumnVisibility colVis) - TODO
- setSplits(String splits | String[] splits) - Sets the splits to use when creating tables. Can be a space-separated list or an array of splits
- setMiniClusterTempDir(String dir) - Sets directory to use as the temp directory for the Mini cluster
###Caching
- setLruMaxCapacity(int max) - TODO
- setVertexCacheTimeout(int millis) - Sets the vertex cache timeout. A value <=0 clears the value
- setEdgeCacheTimeout(int millis) - Sets the edge cache timeout. A value <=0 clears the value
###Preloading
- setPropertyCacheTimeout(int millis) - Sets the element property cache timeout. A value <=0 clears the value
- setPreloadedProperties(String[] propertyKeys) - Sets the property keys that should be preloaded. Requires a positive timeout.
- setPreloadedEdgeLabels(String[] edgeLabels) - TODO
AccumuloGraphConfiguration cfg = ...;
Job j = new Job();
j.setOutputFormatClass(ElementOutputFormat.class);
ElementOutputFormat.setAccumuloGraphConfiguration(j, cfg);
```


@@ -0,0 +1,59 @@
/******************************************************************************
* COPYRIGHT NOTICE *
* Copyright (c) 2014 The Johns Hopkins University/Applied Physics Laboratory *
* All rights reserved. *
* *
* This material may only be used, modified, or reproduced by or for the *
* U.S. Government pursuant to the license rights granted under FAR clause *
* 52.227-14 or DFARS clauses 252.227-7013/7014. *
* *
* For any other permissions, please contact the Legal Office at JHU/APL. *
******************************************************************************/
package edu.jhuapl.tinkerpop;
import com.tinkerpop.blueprints.Features;
/**
* {@link Features} creator.
*/
public class AccumuloFeatures {
public static Features get() {
Features f = new Features();
// For simplicity, I accept all property types. They are handled in not the
// best way. To be fixed later.
f.ignoresSuppliedIds = true;
f.isPersistent = true;
f.isWrapper = false;
f.supportsBooleanProperty = true;
f.supportsDoubleProperty = true;
f.supportsDuplicateEdges = true;
f.supportsEdgeIndex = true;
f.supportsEdgeIteration = true;
f.supportsEdgeRetrieval = true;
f.supportsEdgeKeyIndex = true;
f.supportsEdgeProperties = true;
f.supportsFloatProperty = true;
f.supportsIndices = true;
f.supportsIntegerProperty = true;
f.supportsKeyIndices = true;
f.supportsLongProperty = true;
f.supportsMapProperty = true;
f.supportsMixedListProperty = true;
f.supportsPrimitiveArrayProperty = true;
f.supportsSelfLoops = true;
f.supportsSerializableObjectProperty = true;
f.supportsStringProperty = true;
f.supportsThreadedTransactions = false;
f.supportsTransactions = false;
f.supportsUniformListProperty = true;
f.supportsVertexIndex = true;
f.supportsVertexIteration = true;
f.supportsVertexKeyIndex = true;
f.supportsVertexProperties = true;
f.supportsThreadIsolatedTransactions = false;
return f;
}
}


@@ -294,46 +294,9 @@ public class AccumuloGraph implements Graph, KeyIndexableGraph, IndexableGraph {
// End Aliases
// For simplicity, I accept all property types. They are handled in not the
// best way. To be fixed later
Features f;
@Override
public Features getFeatures() {
if (f == null) {
f = new Features();
f.ignoresSuppliedIds = true;
f.isPersistent = true;
f.isWrapper = false;
f.supportsBooleanProperty = true;
f.supportsDoubleProperty = true;
f.supportsDuplicateEdges = true;
f.supportsEdgeIndex = true;
f.supportsEdgeIteration = true;
f.supportsEdgeRetrieval = true;
f.supportsEdgeKeyIndex = true;
f.supportsEdgeProperties = true;
f.supportsFloatProperty = true;
f.supportsIndices = true;
f.supportsIntegerProperty = true;
f.supportsKeyIndices = true;
f.supportsLongProperty = true;
f.supportsMapProperty = true;
f.supportsMixedListProperty = true;
f.supportsPrimitiveArrayProperty = true;
f.supportsSelfLoops = true;
f.supportsSerializableObjectProperty = true;
f.supportsStringProperty = true;
f.supportsThreadedTransactions = false;
f.supportsTransactions = false;
f.supportsUniformListProperty = true;
f.supportsVertexIndex = true;
f.supportsVertexIteration = true;
f.supportsVertexKeyIndex = true;
f.supportsVertexProperties = true;
f.supportsThreadIsolatedTransactions = false;
}
return f;
return AccumuloFeatures.get();
}
@Override
@@ -368,12 +331,7 @@ public class AccumuloGraph implements Graph, KeyIndexableGraph, IndexableGraph {
if (id == null) {
throw ExceptionFactory.vertexIdCanNotBeNull();
}
String myID;
try {
myID = (String) id;
} catch (ClassCastException e) {
return null;
}
String myID = id.toString();
if (vertexCache != null) {
Vertex vertex = vertexCache.retrieve(myID);
@@ -645,16 +603,10 @@ public class AccumuloGraph implements Graph, KeyIndexableGraph, IndexableGraph {
@Override
public Edge getEdge(Object id) {
String myID;
if (id == null) {
throw ExceptionFactory.edgeIdCanNotBeNull();
} else {
try {
myID = (String) id;
} catch (ClassCastException e) {
return null;
}
}
String myID = id.toString();
if (edgeCache != null) {
Edge edge = edgeCache.retrieve(myID);


@@ -18,7 +18,6 @@ import java.io.File;
import java.io.IOException;
import java.io.Serializable;
import java.lang.reflect.Field;
import java.nio.ByteBuffer;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
@@ -39,6 +38,7 @@ import org.apache.accumulo.core.client.security.tokens.PasswordToken;
import org.apache.accumulo.core.security.Authorizations;
import org.apache.accumulo.core.security.ColumnVisibility;
import org.apache.accumulo.minicluster.MiniAccumuloCluster;
import org.apache.commons.configuration.AbstractConfiguration;
import org.apache.commons.configuration.Configuration;
import org.apache.commons.configuration.PropertiesConfiguration;
import org.apache.hadoop.io.Text;
@@ -52,7 +52,8 @@ import com.tinkerpop.blueprints.KeyIndexableGraph;
* Setters return the same configuration instance to
* ease chained setting of parameters.
*/
public class AccumuloGraphConfiguration implements Serializable {
public class AccumuloGraphConfiguration extends AbstractConfiguration
implements Serializable {
private static final long serialVersionUID = 7024072260167873696L;
@@ -91,7 +92,7 @@ public class AccumuloGraphConfiguration implements Serializable {
/**
* Utility class gathering valid configuration keys.
*/
private static class Keys {
static class Keys {
public static final String GRAPH_CLASS = "blueprints.graph";
public static final String ZK_HOSTS = "blueprints.accumulo.zkhosts";
public static final String INSTANCE = "blueprints.accumulo.instance";
@@ -222,10 +223,14 @@ public class AccumuloGraphConfiguration implements Serializable {
*/
public AccumuloGraphConfiguration setInstanceType(InstanceType type) {
conf.setProperty(Keys.INSTANCE_TYPE, type.toString());
if (type.equals(InstanceType.Mock)) {
if (InstanceType.Mock.equals(type) ||
InstanceType.Mini.equals(type)) {
setUser("root");
setPassword("");
setCreate(true);
}
return this;
}
@@ -272,8 +277,8 @@ public class AccumuloGraphConfiguration implements Serializable {
return this;
}
public ByteBuffer getPassword() {
return ByteBuffer.wrap(conf.getString(Keys.PASSWORD).getBytes());
public String getPassword() {
return conf.getString(Keys.PASSWORD);
}
/**
@@ -679,14 +684,30 @@ public class AccumuloGraphConfiguration implements Serializable {
/**
* Name of the graph to create. Storage tables will be prefixed with this value.
* <p/>Note: Accumulo only allows table names with alphanumeric and underscore
* characters.
* @param name
* @return
*/
public AccumuloGraphConfiguration setGraphName(String name) {
if (!isValidGraphName(name)) {
throw new IllegalArgumentException("Invalid graph name."
+ " Only alphanumerics and underscores are allowed");
}
conf.setProperty(Keys.GRAPH_NAME, name);
return this;
}
/**
* Make sure this is a valid graph name because of restrictions
* on table names.
* @param name
*/
private static boolean isValidGraphName(String name) {
return name.matches("^[A-Za-z0-9_]+$");
}
public String[] getPreloadedProperties() {
if (conf.containsKey(Keys.PRELOAD_PROPERTIES)) {
return conf.getStringArray(Keys.PRELOAD_PROPERTIES);
@@ -881,6 +902,14 @@ public class AccumuloGraphConfiguration implements Serializable {
.setTimeout(getMaxWriteTimeout(), TimeUnit.MILLISECONDS);
}
/**
* File-based version of {@link #setMiniClusterTempDir(String)}.
* @param miniClusterTempDir
*/
public AccumuloGraphConfiguration setMiniClusterTempDir(File miniClusterTempDir) {
return setMiniClusterTempDir(miniClusterTempDir.getPath());
}
/**
* Used by JUnit Tests to set the miniClusterTempDirectory.
* If not set in advance of a test, getConnector will use a
@@ -888,8 +917,9 @@ public class AccumuloGraphConfiguration implements Serializable {
*
* @param miniClusterTempDir
*/
public void setMiniClusterTempDir(String miniClusterTempDir) {
public AccumuloGraphConfiguration setMiniClusterTempDir(String miniClusterTempDir) {
this.miniClusterTempDir = miniClusterTempDir;
return this;
}
public String getVertexTable() {
@@ -932,10 +962,11 @@ public class AccumuloGraphConfiguration implements Serializable {
case Distributed:
checkPropertyValue(Keys.ZK_HOSTS, getZooKeeperHosts(), false);
checkPropertyValue(Keys.USER, getUser(), false);
checkPropertyValue(Keys.PASSWORD, getPassword(), false);
// no break intentional
case Mini:
checkPropertyValue(Keys.INSTANCE, getInstanceName(), false);
checkPropertyValue(Keys.PASSWORD, new String(getPassword().array()), true);
checkPropertyValue(Keys.PASSWORD, getPassword(), true);
// no break intentional
case Mock:
checkPropertyValue(Keys.GRAPH_NAME, getGraphName(), false);
@@ -984,16 +1015,7 @@ public class AccumuloGraphConfiguration implements Serializable {
public void print() {
System.out.println(AccumuloGraphConfiguration.class+":");
Set<String> keys = new TreeSet<String>();
for (Field field : Keys.class.getDeclaredFields()) {
try {
keys.add((String) field.get(null));
} catch (Exception e) {
throw new RuntimeException(e);
}
}
for (String key : keys) {
for (String key : getValidInternalKeys()) {
String value = "(null)";
if (conf.containsKey(key)) {
value = conf.getProperty(key).toString();
@@ -1002,6 +1024,23 @@ public class AccumuloGraphConfiguration implements Serializable {
}
}
/**
* Get the AccumuloGraph-specific keys for this configuration.
* @return
*/
static Set<String> getValidInternalKeys() {
Set<String> keys = new TreeSet<String>();
for (Field field : Keys.class.getDeclaredFields()) {
try {
keys.add((String) field.get(null));
} catch (Exception e) {
throw new RuntimeException(e);
}
}
return keys;
}
// Old deprecated method names.
@@ -1118,4 +1157,37 @@ public class AccumuloGraphConfiguration implements Serializable {
public String[] getPreloadedEdges() {
return getPreloadedEdgeLabels();
}
// Abstract methods from the AbstractConfiguration implementation.
@Override
public boolean isEmpty() {
return conf.isEmpty();
}
@Override
public boolean containsKey(String key) {
return conf.containsKey(key);
}
@Override
public Object getProperty(String key) {
return conf.getProperty(key);
}
@Override
public Iterator<String> getKeys() {
return conf.getKeys();
}
@Override
protected void addPropertyDirect(String key, Object value) {
// Only allow AccumuloGraph-specific keys.
if (getValidInternalKeys().contains(key)) {
conf.setProperty(key, value);
} else {
throw new UnsupportedOperationException("Invalid key: "+key);
}
}
}


@@ -52,7 +52,7 @@ public class ElementOutputFormat extends OutputFormat<NullWritable,Element> {
Configuration jobconf = job.getConfiguration();
jobconf.set(USER, acc.getUser());
jobconf.set(PASSWORD, new String(acc.getPassword().array()));
jobconf.set(PASSWORD, acc.getPassword());
jobconf.set(GRAPH_NAME, acc.getGraphName());
jobconf.set(INSTANCE, acc.getInstanceName());
jobconf.set(INSTANCE_TYPE, acc.getInstanceType().toString());


@@ -0,0 +1,88 @@
/******************************************************************************
* COPYRIGHT NOTICE *
* Copyright (c) 2014 The Johns Hopkins University/Applied Physics Laboratory *
* All rights reserved. *
* *
* This material may only be used, modified, or reproduced by or for the *
* U.S. Government pursuant to the license rights granted under FAR clause *
* 52.227-14 or DFARS clauses 252.227-7013/7014. *
* *
* For any other permissions, please contact the Legal Office at JHU/APL. *
******************************************************************************/
package edu.jhuapl.tinkerpop;
import static org.junit.Assert.*;
import org.junit.Test;
import com.tinkerpop.blueprints.Edge;
import com.tinkerpop.blueprints.Graph;
import com.tinkerpop.blueprints.Vertex;
/**
* Tests related to Accumulo elements.
*/
public class AccumuloElementTest {
@Test
public void testNonStringIds() throws Exception {
Graph graph = AccumuloGraphTestUtils.makeGraph("nonStringIds");
Object[] ids = new Object[] {
10, 20, 30L, 40L,
50.0f, 60.0f, 70.0d, 80.0d,
(byte) 'a', (byte) 'b', 'c', 'd',
"str1", "str2",
new GenericObject("str3"), new GenericObject("str4"),
};
Object[] edgeIds = new Object[] {
100, 200, 300L, 400L,
500.0f, 600.0f, 700.0d, 800.0d,
(byte) 'e', (byte) 'f', 'g', 'h',
"str5", "str6",
new GenericObject("str7"), new GenericObject("str8"),
};
for (int i = 0; i < ids.length; i++) {
assertNull(graph.getVertex(ids[i]));
Vertex v = graph.addVertex(ids[i]);
assertNotNull(v);
assertNotNull(graph.getVertex(ids[i]));
}
assertEquals(ids.length, count(graph.getVertices()));
for (int i = 1; i < edgeIds.length; i++) {
assertNull(graph.getEdge(edgeIds[i-1]));
Edge e = graph.addEdge(edgeIds[i-1],
graph.getVertex(ids[i-1]),
graph.getVertex(ids[i]), "label");
assertNotNull(e);
assertNotNull(graph.getEdge(edgeIds[i-1]));
}
assertEquals(edgeIds.length-1, count(graph.getEdges()));
graph.shutdown();
}
private static int count(Iterable<?> iter) {
int count = 0;
for (@SuppressWarnings("unused") Object obj : iter) {
count++;
}
return count;
}
private static class GenericObject {
private final Object id;
public GenericObject(Object id) {
this.id = id;
}
@Override
public String toString() {
return "GenericObject [id=" + id + "]";
}
}
}


@@ -22,6 +22,7 @@ import java.util.List;
import javax.xml.namespace.QName;
import org.apache.commons.configuration.Configuration;
import org.apache.hadoop.io.Text;
import org.junit.Test;
@@ -30,6 +31,21 @@ import com.tinkerpop.blueprints.Vertex;
public class AccumuloGraphConfigurationTest {
@Test
public void testConfigurationInterface() throws Exception {
Configuration conf = AccumuloGraphTestUtils.generateGraphConfig("setPropsValid");
for (String key : AccumuloGraphConfiguration.getValidInternalKeys()) {
// This is bad... but we should allow them if they are valid keys.
conf.setProperty(key, "value");
}
conf = AccumuloGraphTestUtils.generateGraphConfig("setPropsInvalid");
try {
conf.setProperty("invalidKey", "value");
fail();
} catch (Exception e) { }
}
@Test
public void testSplits() throws Exception {
AccumuloGraphConfiguration cfg;
@@ -215,4 +231,30 @@ public class AccumuloGraphConfigurationTest {
cfg.validate();
assertFalse(cfg.getEdgeCacheEnabled());
}
/**
* Test different kinds of graph names (hyphens, punctuation, etc).
* @throws Exception
*/
@Test
public void testGraphNames() throws Exception {
AccumuloGraphConfiguration conf = new AccumuloGraphConfiguration();
String[] valid = new String[] {
"alpha", "12345", "alnum12345",
"12345alnum", "under_score1", "_under_score_2"};
String[] invalid = new String[] {"hyph-en",
"dot..s", "quo\"tes"};
for (String name : valid) {
conf.setGraphName(name);
}
for (String name : invalid) {
try {
conf.setGraphName(name);
fail();
} catch (Exception e) { }
}
}
}


@@ -14,6 +14,9 @@
*/
package edu.jhuapl.tinkerpop;
import com.tinkerpop.blueprints.Graph;
import com.tinkerpop.blueprints.GraphFactory;
import edu.jhuapl.tinkerpop.AccumuloGraphConfiguration.InstanceType;
public class AccumuloGraphTestUtils {
@@ -21,8 +24,12 @@ public class AccumuloGraphTestUtils {
public static AccumuloGraphConfiguration generateGraphConfig(String graphDirectoryName) {
AccumuloGraphConfiguration cfg = new AccumuloGraphConfiguration();
cfg.setInstanceName("instanceName").setZooKeeperHosts("ZookeeperHostsString");
cfg.setUser("root").setPassword("".getBytes());
cfg.setUser("root").setPassword("");
cfg.setGraphName(graphDirectoryName).setCreate(true).setAutoFlush(true).setInstanceType(InstanceType.Mock);
return cfg;
}
public static Graph makeGraph(String name) {
return GraphFactory.open(generateGraphConfig(name).getConfiguration());
}
}


@@ -0,0 +1,39 @@
package edu.jhuapl.tinkerpop;
import org.apache.commons.configuration.Configuration;
import org.junit.Ignore;
import com.tinkerpop.blueprints.Graph;
import com.tinkerpop.blueprints.GraphFactory;
import edu.jhuapl.tinkerpop.AccumuloGraphConfiguration.InstanceType;
/**
* Run the test suite for a Distributed instance type.
* <p/>Note: This is disabled by default since we can't
* guarantee an Accumulo cluster setup in a test environment.
* To use this, set the constants below and remove
* the @Ignore annotation.
*/
@Ignore
public class DistributedInstanceTest extends AccumuloGraphTest {
// Connection constants.
private static final String ZOOKEEPERS = "zkhost-1,zkhost-2";
private static final String INSTANCE = "accumulo-instance";
private static final String USER = "user";
private static final String PASSWORD = "password";
@Override
public Graph generateGraph(String graphDirectoryName) {
Configuration cfg = new AccumuloGraphConfiguration()
.setInstanceType(InstanceType.Distributed)
.setZooKeeperHosts(ZOOKEEPERS)
.setInstanceName(INSTANCE)
.setUser(USER).setPassword(PASSWORD)
.setGraphName(graphDirectoryName)
.setCreate(true);
testGraphName.set(graphDirectoryName);
return GraphFactory.open(cfg);
}
}


@@ -0,0 +1,23 @@
package edu.jhuapl.tinkerpop;
import org.apache.commons.configuration.Configuration;
import com.tinkerpop.blueprints.Graph;
import com.tinkerpop.blueprints.GraphFactory;
import edu.jhuapl.tinkerpop.AccumuloGraphConfiguration.InstanceType;
/**
* Run the test suite for a Mock instance type.
*/
public class MockInstanceTest extends AccumuloGraphTest {
@Override
public Graph generateGraph(String graphDirectoryName) {
Configuration cfg = new AccumuloGraphConfiguration()
.setInstanceType(InstanceType.Mock)
.setGraphName(graphDirectoryName);
testGraphName.set(graphDirectoryName);
return GraphFactory.open(cfg);
}
}

34
table-structure.md Normal file

@@ -0,0 +1,34 @@
Table Structure
---------------
This file documents the structure of backend tables
used for storing elements, properties, etc.
**Note (29 Dec 2014): This documentation is out of date.**
## Vertex Table
Row ID | Column Family | Column Qualifier | Value
---|---|---|---
VertexID | Label Flag | Exists Flag | [empty]
VertexID | INVERTEX | OutVertexID_EdgeID | Edge Label
VertexID | OUTVERTEX | InVertexID_EdgeID | Edge Label
VertexID | Property Key | [empty] | Serialized Value
## Edge Table
Row ID | Column Family | Column Qualifier | Value
---|---|---|---
EdgeID|Label Flag|InVertexID_OutVertexID|Edge Label
EdgeID|Property Key|[empty]|Serialized Value
## Edge/Vertex Index
Row ID | Column Family | Column Qualifier | Value
---|---|---|---
Serialized Value|Property Key|VertexID/EdgeID|[empty]
## Metadata Table
Row ID | Column Family | Column Qualifier | Value
---|---|---|---
Index Name| Index Class |[empty]|[empty]