Initial commit of table structure documentation; still needs work

Michael Lieberman
2015-02-06 15:24:07 -05:00
parent 93c9c37936
commit f745ff23fb


@@ -1,34 +1,64 @@
Table Structure
---------------
AccumuloGraph Table Schema
==========================
This file documents the structure of backend tables
used for storing elements, properties, etc.
AccumuloGraph uses several backing tables for data storage.
This file documents the structure and organization of these
tables. The tables and records are structured so that
Blueprints operations map onto efficient Accumulo
operations, i.e., prefix searches, contiguous scans, and
batch operations.
For our purposes, Accumulo entries consist of four fields:
row (R), column family (CF), column qualifier (CQ), and value (V).
For more information on Accumulo's schema and how to
access records efficiently, see the
[Accumulo documentation](https://accumulo.apache.org/1.5/accumulo_user_manual.html).
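
To make the four-field model concrete, here is a minimal Python sketch that treats a sorted list of (R, CF, CQ, V) tuples as a stand-in for a table and performs a prefix search as one contiguous scan. It is an illustration only, not AccumuloGraph or Accumulo code; the names `Entry` and `prefix_scan` and the sample rows are hypothetical.

```python
from bisect import bisect_left
from typing import List, NamedTuple


class Entry(NamedTuple):
    """One entry in the simplified model: row, column family, column qualifier, value."""
    row: str
    cf: str
    cq: str
    value: str


def prefix_scan(entries: List[Entry], row: str, cf: str, cq_prefix: str = "") -> List[Entry]:
    """Return entries in one row and column family whose qualifier starts with cq_prefix.

    `entries` must already be sorted; the sorted list stands in for the sorted key space.
    """
    # Jump to the first entry that could match, then read forward.
    start = bisect_left(entries, Entry(row, cf, cq_prefix, ""))
    result = []
    for entry in entries[start:]:
        if entry.row != row or entry.cf != cf or not entry.cq.startswith(cq_prefix):
            break  # keys are sorted, so all matches form one contiguous range
        result.append(entry)
    return result


# Hypothetical sample data, sorted the way the model expects.
table = sorted([
    Entry("r1", "cf_a", "q1", "1"),
    Entry("r1", "cf_b", "x_1", "2"),
    Entry("r1", "cf_b", "x_2", "3"),
    Entry("r1", "cf_b", "y_1", "4"),
    Entry("r2", "cf_b", "x_1", "5"),
])
matches = prefix_scan(table, "r1", "cf_b", "x_")
```

Because matching keys are adjacent in sorted order, the scan stops at the first non-matching entry instead of reading the whole table.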
**Note (29 Dec 2014): This documentation is out of date.**
The tables used by AccumuloGraph are prefixed by the configured graph
name and fall into two classes: element/property tables and
index tables. Their structure is discussed below.
## Vertex Table
Elements and properties
-----------------------
Row ID | Column Family | Column Qualifier | Value
---|---|---|---
VertexID | Label Flag | Exists Flag | [empty]
VertexID | INVERTEX | OutVertexID_EdgeID | Edge Label
VertexID | OUTVERTEX | InVertexID_EdgeID | Edge Label
VertexID | Property Key | [empty] | Serialized Value
Vertices and edges, along with their properties, are stored
in the *graphname*\_vertex and *graphname*\_edge tables, respectively.
## Edge Table
| R | CF | CQ | V |
|---|----|----|---|
| *vertex_id* | `_LABEL_` | `_EXISTS_`| *[empty]* |
Row ID | Column Family | Column Qualifier | Value
---|---|---|---
EdgeID|Label Flag|InVertexID_OutVertexID|Edge Label
EdgeID|Property Key|[empty]|Serialized Value
| R | CF | CQ | V |
|---|----|----|---|
| *edge_id* | `_LABEL_` | *in_vertex_id*`_DELIM_`*out_vertex_id* | *edge_label* |
## Edge/Vertex Index
Row ID | Column Family | Column Qualifier | Value
---|---|---|---
Serialized Value|Property Key|VertexID/EdgeID|[empty]
| R | CF | CQ | V |
|---|----|----|---|
| *in_vertex_id* | `_IN_EDGE_` | *out_vertex_id*`_DELIM_`*edge_id* | *edge_label* |
| *out_vertex_id* | `_OUT_EDGE_` | *in_vertex_id*`_DELIM_`*edge_id* | *edge_label* |
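
As a rough illustration of the edge and adjacency rows in the new-style tables, the following Python sketch builds the three records for a single edge. It rests on assumptions: `edge_records` is a hypothetical helper, `_DELIM_` is kept as a literal placeholder because the actual delimiter is not given here, and placing the `_IN_EDGE_` and `_OUT_EDGE_` rows in the vertex table is inferred from their row IDs.

```python
from typing import Dict, List, Tuple

Record = Tuple[str, str, str, str]  # (R, CF, CQ, V)

DELIM = "_DELIM_"  # placeholder: the real delimiter value is not specified in this file


def edge_records(edge_id: str, out_vertex_id: str, in_vertex_id: str,
                 label: str) -> Dict[str, List[Record]]:
    """Build the records for one edge, grouped by the table they are assumed to live in."""
    # Edge table: the edge row records both endpoints and the label.
    edge_row = (edge_id, "_LABEL_", in_vertex_id + DELIM + out_vertex_id, label)
    # Vertex table (assumed): one adjacency row per endpoint.
    in_row = (in_vertex_id, "_IN_EDGE_", out_vertex_id + DELIM + edge_id, label)
    out_row = (out_vertex_id, "_OUT_EDGE_", in_vertex_id + DELIM + edge_id, label)
    return {"edge": [edge_row], "vertex": [in_row, out_row]}


# Hypothetical edge e1 from v1 to v2 with label "knows".
records = edge_records("e1", out_vertex_id="v1", in_vertex_id="v2", label="knows")
```

Storing the edge under both endpoints means the edges touching a vertex can be read from that vertex's row alone, restricted to the `_IN_EDGE_` or `_OUT_EDGE_` column family.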
## Metadata Table
| R | CF | CQ | V |
|---|----|----|---|
| *element_id* | *property_key* | *[empty]* | *property_value* |
Row ID | Column Family | Column Qualifier | Value
---|---|---|---
Index Name| Index Class |[empty]|[empty]
Indexes
-------
Relevant table names:
1. *graphname*\_vertex\_key\_index - Vertex property key index
2. *graphname*\_edge\_key\_index - Edge property key index
3. *graphname*\_indexed\_keys - List of indexed keys
4. *graphname*\_index\_names - List of named indexes
5. *graphname*\_index\_*indexname* - Named index
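
The naming pattern for all of the tables mentioned in this file can be sketched as a small Python helper. `table_names` is a hypothetical function written for illustration; it only mirrors the patterns documented here and is not part of AccumuloGraph.

```python
from typing import Dict, Iterable


def table_names(graph_name: str, named_indexes: Iterable[str] = ()) -> Dict[str, str]:
    """Derive table names from the configured graph name, following the patterns above."""
    names = {
        "vertex": f"{graph_name}_vertex",
        "edge": f"{graph_name}_edge",
        "vertex_key_index": f"{graph_name}_vertex_key_index",
        "edge_key_index": f"{graph_name}_edge_key_index",
        "indexed_keys": f"{graph_name}_indexed_keys",
        "index_names": f"{graph_name}_index_names",
    }
    # One extra table per named index.
    for index_name in named_indexes:
        names[f"index:{index_name}"] = f"{graph_name}_index_{index_name}"
    return names


# Hypothetical graph called "mygraph" with one named index "people".
names = table_names("mygraph", named_indexes=["people"])
```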
| R | CF | CQ | V |
|---|----|----|---|
| *property_value* | *property_key* | *element_id* | *[empty]* |
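
The index row layout above puts the property value in the row, so a lookup by key and value only needs the entries of one row and column family. Below is a minimal Python sketch of that idea under simplifying assumptions: values are plain strings rather than serialized bytes, the index is an in-memory sorted list, and `build_key_index` and `lookup` are hypothetical names.

```python
from typing import Iterable, List, Tuple

Record = Tuple[str, str, str, str]  # (R, CF, CQ, V)


def build_key_index(properties: Iterable[Tuple[str, str, str]]) -> List[Record]:
    """Turn (element_id, property_key, property_value) triples into index records."""
    records = {(value, key, element_id, "") for element_id, key, value in properties}
    return sorted(records)


def lookup(index: List[Record], key: str, value: str) -> List[str]:
    """Return the IDs of elements whose property `key` equals `value`.

    A linear filter keeps the sketch short; in Accumulo this would be a scan
    restricted to row `value` and column family `key`.
    """
    return [cq for (row, cf, cq, _v) in index if row == value and cf == key]


# Hypothetical properties on three vertices.
index = build_key_index([
    ("v1", "name", "alice"),
    ("v2", "name", "bob"),
    ("v3", "name", "alice"),
    ("v3", "city", "alice"),
])
alice_ids = lookup(index, "name", "alice")
```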
| R | CF | CQ | V |
|---|----|----|---|
| *property_key* | *element_class* | *[empty]* | *[empty]* |
| R | CF | CQ | V |
|---|----|----|---|
| *index_name* | *element_class* | *[empty]* | *[empty]* |