AccumuloGraph Table Schema
AccumuloGraph uses a number of backing tables for data storage. This file documents the structure and organization of these tables. The tables and records are structured in such a way as to implement the Blueprints operations in terms of efficient Accumulo operations, i.e., prefix searches, contiguous scans, and batch operations. For our purposes, Accumulo entries consist of four fields: row (R), column family (CF), column qualifier (CQ), and value (V). For more information on Accumulo's schema and how to access records efficiently, see the Accumulo documentation.
The tables used by AccumuloGraph are prefixed by the configured graph name and fall into two classes: element/property tables and index tables. Their structure is discussed below.
Elements and properties
Vertex and edge information, along with their properties, are stored in the graphname_vertex and graphname_edge tables respectively.
First, an entry declares the existence of a vertex in the vertex table.
| R | CF | CQ | V |
|---|---|---|---|
| vertex_id | _LABEL_ | _EXISTS_ | [empty] |
A similar entry declares the existence of an edge in the edge table.
| R | CF | CQ | V |
|---|---|---|---|
| edge_id | _LABEL_ | in_vertex_id_DELIM_out_vertex_id | edge_label |
When adding an edge, additional entries are stored in the vertex table for each endpoint of the edge. These facilitate the Vertex.getEdges and Vertex.getVertices operations.
| R | CF | CQ | V |
|---|---|---|---|
| in_vertex_id | _IN_EDGE_ | out_vertex_id_DELIM_edge_id | edge_label |
| out_vertex_id | _OUT_EDGE_ | in_vertex_id_DELIM_edge_id | edge_label |
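The entry layouts above can be sketched in a few lines of Python. This is only an illustration of the schema, not AccumuloGraph's implementation: the `Entry` type and both helper functions are hypothetical, and the string `_DELIM_` stands in for the real delimiter, which this document does not specify.

```python
from typing import List, NamedTuple, Tuple

DELIM = "_DELIM_"  # placeholder; the actual delimiter is not specified here


class Entry(NamedTuple):
    """One Accumulo entry: row, column family, column qualifier, value."""
    row: str
    cf: str
    cq: str
    value: str


def vertex_entries(vertex_id: str) -> List[Entry]:
    """Vertex-table entry declaring that a vertex exists."""
    return [Entry(vertex_id, "_LABEL_", "_EXISTS_", "")]


def edge_entries(
    edge_id: str, out_vertex_id: str, in_vertex_id: str, label: str
) -> Tuple[List[Entry], List[Entry]]:
    """Return (edge-table entries, vertex-table entries) for a new edge."""
    edge_table = [
        Entry(edge_id, "_LABEL_", in_vertex_id + DELIM + out_vertex_id, label),
    ]
    vertex_table = [
        # Stored under the in-vertex: points back at the out-vertex and edge.
        Entry(in_vertex_id, "_IN_EDGE_", out_vertex_id + DELIM + edge_id, label),
        # Stored under the out-vertex: points at the in-vertex and edge.
        Entry(out_vertex_id, "_OUT_EDGE_", in_vertex_id + DELIM + edge_id, label),
    ]
    return edge_table, vertex_table
```

Because both endpoint entries live in the row of their own vertex, all edges incident to a vertex can be read with a single scan of that vertex's row, restricted to the `_IN_EDGE_` or `_OUT_EDGE_` column family.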
Finally, vertex and edge properties are stored in their respective tables. Entry formats are the same for both vertices and edges. Note that property values are serialized such that their type can be deduced when deserializing.
| R | CF | CQ | V |
|---|---|---|---|
| element_id | property_key | [empty] | property_value |
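As a rough sketch of how property entries might be written and read back, the snippet below models a table as a list of (R, CF, CQ, V) tuples. Everything here is hypothetical: JSON is used only as a stand-in for a type-preserving serializer (the actual serialization format is not described in this document), and the code assumes that property keys never collide with the reserved column families `_LABEL_`, `_IN_EDGE_`, and `_OUT_EDGE_`.

```python
import json

# Column families used for structural entries rather than properties.
RESERVED_CFS = {"_LABEL_", "_IN_EDGE_", "_OUT_EDGE_"}


def property_entry(element_id, key, value):
    """Build the (R, CF, CQ, V) entry for one property of an element."""
    # json.dumps keeps enough type information to tell 30 from "30".
    return (element_id, key, "", json.dumps(value))


def read_properties(table, element_id):
    """Collect the properties of one element from its row."""
    props = {}
    for row, cf, cq, value in table:
        if row == element_id and cf not in RESERVED_CFS:
            props[cf] = json.loads(value)
    return props
```

For example, a table holding `property_entry("v1", "age", 30)` would yield `{"age": 30}` from `read_properties(table, "v1")`, with the value restored as an integer rather than a string.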
Indexes
Several tables store index-related information, including index value tables that store index property keys and values, and index metadata tables that store information about what indexes exist and what properties are indexed.
For KeyIndexableGraph, index value tables include graphname_vertex_key_index and graphname_edge_key_index for vertex and edge properties, respectively. For IndexableGraph, index value tables are named graphname_index_indexname, where indexname is the index name.
The entry formats in all these tables are the same:
| R | CF | CQ | V |
|---|---|---|---|
| property_value | property_key | element_id | [empty] |
Property values are serialized in the same way as above.
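Placing the property value in the row means that a key/value lookup becomes a contiguous scan over sorted entries. The sketch below imitates this with a sorted Python list and `bisect`; it is an illustration only, the function names are hypothetical, and plain strings are used in place of serialized property values.

```python
from bisect import bisect_left


def index_entry(key, value, element_id):
    """Build the (R, CF, CQ, V) index entry for one indexed property."""
    return (value, key, element_id, "")


def lookup(index_table, key, value):
    """Return the ids of elements whose property `key` equals `value`.

    `index_table` must be sorted, mirroring how Accumulo keeps entries
    ordered by row, column family, and column qualifier.
    """
    # Seek to the first entry with this row (value) and column family (key).
    start = bisect_left(index_table, (value, key, "", ""))
    ids = []
    for row, cf, cq, _ in index_table[start:]:
        if (row, cf) != (value, key):
            break  # past the contiguous run of matching entries
        ids.append(cq)
    return ids
```

All matches for a given value and key sit next to each other in sort order, so the scan stops at the first entry that does not match.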
Two index metadata tables store index information.
For KeyIndexableGraph, graphname_indexed_keys enumerates the property keys that are indexed in graphname_vertex_key_index and graphname_edge_key_index.
| R | CF | CQ | V |
|---|---|---|---|
| property_key | element_class | [empty] | [empty] |
For IndexableGraph, graphname_index_names lists the existing indexes.
| R | CF | CQ | V |
|---|---|---|---|
| index_name | element_class | [empty] | [empty] |
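To summarize the naming scheme, the following sketch derives every table name mentioned in this document from a graph name. The function `table_names` is hypothetical and only mirrors the naming patterns described here; it does not call any AccumuloGraph or Accumulo API.

```python
def table_names(graph_name, index_names=()):
    """Return the backing table names for a graph, keyed by role."""
    names = {
        "vertex": f"{graph_name}_vertex",
        "edge": f"{graph_name}_edge",
        "vertex_key_index": f"{graph_name}_vertex_key_index",
        "edge_key_index": f"{graph_name}_edge_key_index",
        "indexed_keys": f"{graph_name}_indexed_keys",
        "index_names": f"{graph_name}_index_names",
    }
    # One index value table per named index (IndexableGraph).
    for index_name in index_names:
        names[f"index:{index_name}"] = f"{graph_name}_index_{index_name}"
    return names
```

For a graph named `mygraph` with a named index `people`, this yields names such as `mygraph_vertex`, `mygraph_indexed_keys`, and `mygraph_index_people`.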