The British intelligence agency GCHQ has released the Gaffer database as an open source project.
Gaffer is a kind of database written in Java that makes it “easy to store large-scale graphs in which the nodes and edges have statistics such as counts, histograms and sketches.” Its code is available for download on the code-sharing website GitHub.
“Gaffer is a framework that makes it easy to store large-scale graphs in which the nodes and edges have statistics such as counts, histograms and sketches. These statistics summarise the properties of the nodes and edges over time windows, and they can be dynamically updated over time,” states its description on GitHub.
In reality, Gaffer is much more: it implements a framework for creating mass-scale databases, and it is a powerful tool for storing and analyzing the relationships between different pieces of data.
“Gaffer is a graph database, rather than a graph processing system. It is optimised for retrieving data on nodes of interest,” continues the description. “Gaffer is distinguished from other graph storage systems by its ability to update properties within the store itself.”
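To make the idea of statistics that are summarised per time window and updated inside the store more concrete, here is a minimal Python sketch. It is not Gaffer's API or implementation: the `SummarisedGraph` class, its method names and the one-hour window are all hypothetical, and a plain in-memory dictionary stands in for the real store.

```python
from collections import defaultdict


class SummarisedGraph:
    """Toy in-memory graph (hypothetical, not Gaffer's API).

    Each edge keeps one count per time window, and adding an edge that
    already exists updates the stored count instead of adding a new row.
    """

    def __init__(self, window_seconds=3600):
        self.window_seconds = window_seconds
        # (source, destination, window_start) -> count
        self.edge_counts = defaultdict(int)

    def add_edge(self, source, destination, timestamp, count=1):
        # Round the timestamp down to the start of its time window.
        window_start = timestamp - (timestamp % self.window_seconds)
        # Merge into the existing statistic held in the store.
        self.edge_counts[(source, destination, window_start)] += count

    def edges_from(self, node):
        # Retrieve the summarised edges for a single node of interest.
        return {
            (dst, window): count
            for (src, dst, window), count in self.edge_counts.items()
            if src == node
        }


graph = SummarisedGraph(window_seconds=3600)
graph.add_edge("A", "B", timestamp=10)
graph.add_edge("A", "B", timestamp=20)    # same window: count is merged
graph.add_edge("A", "B", timestamp=3700)  # next window: separate count
graph.add_edge("A", "C", timestamp=30)
summary = graph.edges_from("A")
```

In this sketch, repeated observations of the same edge within one window collapse into a single counter, which is the kind of in-store update the description refers to.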
Gaffer implements features to carry out several tasks:
- Allow the creation of graphs with summarised properties within Accumulo with very little coding.
- Allow flexibility in the statistics that describe the entities and edges.
- Allow easy addition of nodes and edges.
- Allow quicker retrieval of data on nodes of interest.
- Deal with data of different security levels – all data has a visibility, which is used to restrict who can access data based on their authorizations.
- Support automatic age-off of data.
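The last two points, visibility and age-off, can be illustrated with a short Python sketch. It rests on simplifying assumptions: each record carries a single visibility label (Accumulo itself supports richer visibility expressions), and the `Record` class, the `readable_records` function and the label names are hypothetical rather than taken from Gaffer.

```python
from dataclasses import dataclass


@dataclass
class Record:
    node: str
    visibility: str   # a single label; a deliberate simplification
    timestamp: int    # seconds since some epoch


def readable_records(records, authorizations, now, max_age_seconds):
    """Return records the caller is authorised to see and that are not too old."""
    return [
        r for r in records
        if r.visibility in authorizations          # visibility check
        and now - r.timestamp <= max_age_seconds   # age-off check
    ]


records = [
    Record("alice", "public", timestamp=900),
    Record("bob", "secret", timestamp=950),
    Record("carol", "public", timestamp=100),  # too old, will be aged off
]

result = readable_records(
    records, authorizations={"public"}, now=1000, max_age_seconds=500
)
names = [r.node for r in result]
```

Here a user holding only the `public` authorization sees `alice`, but not `bob` (wrong visibility) or `carol` (older than the retention period).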
Gaffer is built on Apache Accumulo, a sorted, distributed key/value store based on the BigTable technology developed by Google.
Accumulo was created in 2008 by the US National Security Agency and it is released under the Apache 2.0 license.
Gaffer is likewise distributed under the Apache 2.0 license, which allows anyone to modify or distribute it.
Security experts speculate that GCHQ uses Gaffer to analyze data related to a specific entity, which could be a terrorist or any other subject under investigation.
“Each node might be a surveilled terrorist or other source of data, and analysis of the graph might then show who or what is at the ‘center’ of that network,” said Andrii Degeler, a journalist at Ars Technica.
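One simple way to find the “center” of a network is degree centrality, which ranks nodes by how many distinct neighbours they have. The Python sketch below is only an illustration of that general technique; nothing in the source says it is the analysis GCHQ runs, and the function name and sample edges are made up.

```python
from collections import defaultdict


def degree_centrality(edges):
    """Count the distinct neighbours of each node in an undirected graph."""
    neighbours = defaultdict(set)
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)
    return {node: len(adjacent) for node, adjacent in neighbours.items()}


# Hypothetical sample data: "hub" is connected to every other node.
edges = [("hub", "n1"), ("hub", "n2"), ("hub", "n3"), ("n1", "n2")]
scores = degree_centrality(edges)
central = max(scores, key=scores.get)
```

In this example `hub` has three neighbours, more than any other node, so it is reported as the most central one.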
The motivation behind the release of the platform is unclear, but it is likely that the agency is trying to attract young talent from the hacker community.
GCHQ is currently working on Gaffer2, as reported on GitHub:
“The version of Gaffer in this repo is no longer under active development because a project called Gaffer2 is in development.”