Graph & Data Analytics

Bringing understanding to large, disparate data

Data essential for examining hard scientific and national security questions are often large, messy, or incomplete. Frequently, these data exist in networks or graphs that grow more complex as the amount of data grows.

Take the energy sector, for example. Technologies continually evolve and new sources of energy generation come to fruition. These rapid changes create challenges for data integration and understanding.

Likewise, related data carried over digital networks can be nearly impossible to connect. Graphs can capture or convey these data, but only at a very high level.

Our researchers are pioneering data and graph analytics using novel visualization and machine learning techniques to tease out data connections.

## Data integration across devices and networks

PNNL researchers developed the VOLTTRON™ software platform to integrate smart device operations and communications over the power grid. The platform’s analytical environment seamlessly and securely connects a wide range of data and devices to make automatic decisions based on user needs and preferences. When used in building systems to manage energy consumption, VOLTTRON improves overall system performance and creates a more flexible and reliable grid.

Our researchers are also developing scalable graph platforms to understand network structures and extract actionable information embedded in diverse data sets. Ripples, a first-of-its-kind network analysis tool, can solve complex graph analytics problems in less than a minute on a high-performance computing platform. Grappolo can perform blazing-fast graph clustering (community detection) on graphs with millions to billions of nodes. Vite, a distributed version of Grappolo, can scale computation on leadership-class machines for graphs with tens to hundreds of billions of edges, with demonstrated performance on modern multi-GPU systems.
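To give a sense of what community detection measures, here is a minimal Python sketch of modularity, a score that many clustering methods try to maximize. It is an illustration only and is not taken from Grappolo or Vite; the function name `modularity`, the toy edge list, and the community labels are all assumptions made for this example.

```python
from collections import defaultdict

def modularity(edges, community_of):
    """Modularity of a partition of an undirected, unweighted graph.

    Q = sum over communities c of (L_c / m) - (d_c / (2m))^2, where
    L_c is the number of edges inside c, d_c is the total degree of
    the nodes in c, and m is the total number of edges.
    """
    m = len(edges)
    intra = defaultdict(int)       # edges with both ends in the same community
    degree_sum = defaultdict(int)  # total degree per community
    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        degree_sum[cu] += 1
        degree_sum[cv] += 1
        if cu == cv:
            intra[cu] += 1
    return sum(intra[c] / m - (degree_sum[c] / (2 * m)) ** 2 for c in degree_sum)

# Hypothetical toy graph: two triangles joined by a single bridge edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]

two_groups = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
one_group = {node: "all" for node in range(6)}

q_two = modularity(edges, two_groups)  # each triangle as its own community
q_one = modularity(edges, one_group)   # everything lumped together
```

Splitting the graph along the bridge scores higher than lumping every node together, which matches the intuition that the two triangles form separate communities. Tools built for graphs with billions of edges have to search for such partitions in parallel rather than score a single one.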

We are also developing tools to help analysts understand and interpret data contained in complex graphs. While typical graphs model binary relationships, hypergraphs model multi-way relationships, exposing the interconnectedness of the data without artificially generating two-way relationships. HyperNetX (HNX) is an open-source Python library developed to analyze and visualize multi-way relationships modeled as hypergraphs. These relationships can be found in cyber data, protein pathways, bibliographic networks, and social media, where interactions involve multiple entities simultaneously.
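The difference between a multi-way and a two-way model can be sketched in a few lines of plain Python. This example deliberately does not use the HyperNetX API; the bibliographic data and the helper names `node_memberships` and `pairwise_expansion` are hypothetical and exist only for illustration.

```python
from itertools import combinations

# Hypothetical data: each hyperedge (a paper) joins all of its authors at once.
hyperedges = {
    "paper_1": {"ana", "bo", "cy"},
    "paper_2": {"bo", "dee"},
    "paper_3": {"ana", "bo", "cy", "dee"},
}

def node_memberships(hyperedges):
    """Map each node to the set of hyperedges that contain it."""
    memberships = {}
    for name, members in hyperedges.items():
        for node in members:
            memberships.setdefault(node, set()).add(name)
    return memberships

def pairwise_expansion(hyperedges):
    """Flatten every hyperedge into two-way links between its members."""
    return {
        frozenset(pair)
        for members in hyperedges.values()
        for pair in combinations(sorted(members), 2)
    }

memberships = node_memberships(hyperedges)
pairs = pairwise_expansion(hyperedges)

# The hypergraph still records which group interaction each link came from;
# the pairwise graph keeps only the links.
papers_with_bo = memberships["bo"]
number_of_pairs = len(pairs)
```

In the flattened version, every pair of the four authors ends up linked, so the graph can no longer tell the three-author paper from the four-author one. The hypergraph keeps each paper as a single multi-way relationship.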

PNNL has a deep understanding of the mathematical principles that govern the digital landscape. Our data and graph analytics technologies have been deployed in domains including threat detection for national security, cyber analytics, scientific computing, intellectual property portfolio analysis, energy grid reliability, environmental safety, training, and law enforcement. Further, in collaboration with joint institutes such as the University of Washington, our researchers are discovering innovative solutions to complex analytic challenges.