Big Data Based Management for Smart Grids
Atimad El Khaouat*, Laila Benhlima*
* Mohammadia School of Engineers
Mohammed V University of Rabat
Morocco
[email protected], [email protected]
Abstract-Information and communication technologies play a
crucial role in the many research efforts aimed at improving the
existing electrical grid. With the emergence of the Internet of
Things and the growing availability of connected devices such as
smart meters and other sensors, we are facing a huge amount of
data about energy consumption, energy production and so on. In
this context, smart grid data management and analytics using big
data tools help to manage the huge volume of data collected
from smart devices installed in the grid in order to extract
knowledge, define key performance indicators, forecast demand
response behavior, etc. This paper presents a solution
for managing smart grid big data and making it available to
high-level applications. We propose a global solution architecture,
detail each of its components and explain the data flow
and analysis through the big data process.
Keywords-Smart Grid; Data Management; Analytics; Big Data;
IT Solution; Architecture
I. INTRODUCTION
A smart grid is an automation system established by
integrating a large pool of sensors, smart meters, substations, etc.
into the existing power grid in order to control and
monitor it through information and communication technologies
[1].
These intelligent devices produce different and
heterogeneous types of data: weather data, consumption data,
energy production data, etc. This explosion in data reflects the
fact that a smart grid involves not just more detailed meter
information, but a wide range of intelligent devices and data
types that must be well managed in order to benefit from the
smart grid: to better understand customer behavior, detect
outages, fraud or theft, and forecast energy demand more
accurately. This requires complex processing
[2], due to:
• The nature of the data: time series, stream data...
• Their distributed nature and the need to process them at
different scales, since they come from different sources.
• The need for real-time analytics in certain cases.
Data management design in any context should optimize
outcomes in two ways. First, it should extract clean and consistent
information that drives targeted benefits for the business.
Second, having identified those benefits, it should
minimize the cost of the infrastructure needed to obtain and
process the data necessary to deliver them.
This paper focuses on IT solutions in the domain of smart
grid data management and analytics using big data tools. The
second section reviews related work in the field, namely the
ACCENTURE, EDF and ITRON-TERADATA solutions. The third
section presents the proposed solution as a functional
architecture that explains the flow of data from the collection
phase (low level of the architecture) to the higher levels where data
management and data analytics are executed in order to make
decisions, produce reports and develop future applications. We
detail each component of the architecture. Finally, we
conclude and give perspectives on our work.
II. RELATED WORKS
In the last decades, smart grids have attracted a lot of interest
from researchers and industry. For smart grid data
management, however, we find only commercial solutions in the
literature. Indeed, the prominent works in smart grid data
management and advanced analytics come from
ACCENTURE, ITRON-TERADATA, INFOSYS, IBM, etc. In
this paper we present three of them: the ACCENTURE solution,
the ITRON-TERADATA system and the EDF solution.
A. EDF solution
The EDF IT solution is based on mature standards and on a
Metering Data Management System (MDMS), a
software platform that acquires data from multiple
sources and makes this data available after integration,
synchronization and cleaning. This platform offers:
• A channel for hourly data, available the next day on a
web portal.
• A real-time channel for alerts.
• A feed to a data warehouse for historical data.
The EDF data management architecture is built from four layers
[3]:
• Data collection layer: a set of smart meters and
programmable devices installed in the grid in order to
collect consumption data.
• Network layer: contains a field area network, meter
control and a wide area network. This layer ensures
communication and data transfer in the grid between
consumption and production end points and the control
center.
978-1-5090-5713-9/16/$31.00 ©2016 IEEE
• Meter data management: ensures that data is cleaned and
classified before it is integrated. There are three
types of data, each with a different treatment:
o Events: processed by an event processor to feed
the outage management application.
o Power consumption: stored in the Meter Data
Management Repository for future analysis.
o Operational and system data: processed by a
management system for billing forecasts and
demand response prediction, and also
communicated to the consumer web portal.
• Applications layer: EDF applications presented in the
solution architecture are: outage management, web
portal, demand response, billing and distribution
sizing.
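As an illustration of the meter data management layer described above, the routing of the three data types to their respective treatments can be sketched as follows. This is a minimal sketch and not EDF's implementation: the record format, the `kind` field and the destination names are hypothetical.

```python
from collections import defaultdict

# Hypothetical destination per data type, following the three types in the text.
TARGETS = {
    "event": "event_processor",              # feeds outage management
    "consumption": "meter_data_repository",  # stored for future analysis
    "operational": "management_system",      # billing forecasts, demand response
}


def route(records):
    """Group already-cleaned records by the component that should handle them."""
    destinations = defaultdict(list)
    for record in records:
        target = TARGETS.get(record.get("kind"))
        if target is None:
            # Unknown type: set aside instead of guessing a treatment.
            destinations["rejected"].append(record)
        else:
            destinations[target].append(record)
    return dict(destinations)


sample = [
    {"kind": "event", "meter": "m1", "code": "power_loss"},
    {"kind": "consumption", "meter": "m1", "kwh": 1.2},
    {"kind": "operational", "meter": "m2", "voltage": 229.5},
    {"kind": "unknown", "meter": "m3"},
]
routed = route(sample)
```

Keeping the mapping in a single table makes it easy to add a new data type without touching the routing loop.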
The EDF solution does not consider the big data
generated by all the sensors. Moreover, it does not take
weather and production data into account to produce relevant
analytics and develop applications corresponding to future
needs.
B. ACCENTURE solution
ACCENTURE proposes a system to manage five distinct
smart grid data classes: Operational data, Non-operational
data, Meter usage data, Event message data and Metadata [4].
Each class has its own properties that should be treated and
managed in different ways based on its source, characteristics
and applicability.
ACCENTURE data analytics architecture aims at
addressing the following challenges [5]:
• Matching the data acquisition infrastructure to the
required outcomes.
• Learning to apply new tools, standards and
architectures to manage grid data at scale.
• Transforming processes throughout the business to
take advantage of smart grid technology.
• Managing master data to enable the benefits from
smart grid capabilities.
The ACCENTURE solution proposes to extract knowledge
through the following components:
• Signal analytics: uses substation waveform data, line
sensor waveform data, etc. to determine key indicators
such as electrical distance domain.
• Event analytics: detects, classifies and filters
event data.
• State analytics: applied to stream data to create
real-time information such as the real-time electrical state
and the real-time grid topology.
• Operational analytics: helps to define system
performance, asset health and load forecasts.
• Customer analytics: applied to consumption and client
data in order to construct demand profiles, demand
response behavior and customer segmentation.
To date, Accenture has catalogued more than 200 smart
grid analytics and several classes of technical analytics such as
electrical and device states (including traditional, renewable
and distributed energy resources), power quality, customer
behavior (especially in terms of demand response), etc.
The ACCENTURE solution needs to integrate more big data
tools to improve not only analysis, forecasting and decision
making, but the whole operation of the power system. In addition, the
solution focuses on analytical aspects without explaining
the processes of data treatment, management and storage.
C. ITRON-TERADATA solution
TERADATA, which is among the leading companies in
big data infrastructure, and ITRON have developed an Active
Smart Grid Analytics solution.
The ITRON-TERADATA architecture is based on Active
Smart Grid Analytics (ASA). Its Active Smart Grid Data
Warehouse provides an architectural approach that supports the
application of real-time analytics and enables smarter, faster
decisions.
According to this architecture, the Active Smart Grid
Data Warehouse must accommodate simultaneous loading of
large data volumes from multiple sources: meters, sensors and
control devices and at the same time perform complex
analytics such as: demand response, load forecasts, customer
behavior, future overloads...
These analytics are called active because they analyze and
correlate data from all related systems involved in the smart
grid as it arrives, triggering actions, and participating in
workflows [6].
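The notion of active analytics, where data is analyzed as it arrives and can trigger actions, can be illustrated with a small rule engine. The sketch below is only an assumption about how such behavior might look; it is not the ITRON-TERADATA implementation, and the class name, rule and field names are hypothetical.

```python
class ActiveAnalytics:
    """Evaluate registered rules on each reading as it arrives."""

    def __init__(self):
        self.rules = []  # list of (predicate, action) pairs

    def add_rule(self, predicate, action):
        self.rules.append((predicate, action))

    def on_arrival(self, reading):
        """Run every rule on the new reading and return the triggered actions."""
        return [action(reading) for predicate, action in self.rules
                if predicate(reading)]


engine = ActiveAnalytics()
# Hypothetical rule: warn when a feeder exceeds 90% of its rated capacity.
engine.add_rule(
    lambda r: r["load_kw"] > 0.9 * r["capacity_kw"],
    lambda r: f"overload warning on {r['feeder']}",
)

alerts = engine.on_arrival({"feeder": "F12", "load_kw": 95.0, "capacity_kw": 100.0})
quiet = engine.on_arrival({"feeder": "F13", "load_kw": 40.0, "capacity_kw": 100.0})
```

Here the action simply returns a message; in a real deployment it would presumably start a workflow or notify an operator.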
A key component of ASA, the Utility Logical Data
Model (uLDM), is a comprehensive model for analyzing smart
grid information.
The three systems presented offer solutions for smart grid
management, but they are commercial solutions, and little
detail, sometimes none, is available about the components of
their data management architectures.
Moreover, in our solution, we aim not only at processing
real-time stream data to detect anomalies and
ensure accurate forecasting, but also at providing the storage and
the models needed for managing different kinds of data and
analyzing them.
III. PROPOSED SOLUTION
In this section we propose and detail the different components
of our smart grid data management architecture. On the one hand,
our solution is able to manage large and varied amounts of
data collected from different sources thanks to big data
tools; on the other hand, it provides support for smart grid
applications such as demand prediction through analytics
processing. In addition, it handles both stream data that
must be processed for real-time applications and stored data that
is accessed through the request processing components
by applications such as billing, reporting, etc.
A. Architecture
Figure 1 shows the proposed architecture, where four
principal levels are presented:
• First level: the lowest one, responsible for collecting
information from the different consumption or production
points. It includes smart devices such as:
o AMR (automatic meter reading): the technology of
automatically collecting consumption, diagnostic and status data
from meter devices [7].
o Sensors: devices that respond to a physical
stimulus (heat, light, sound, pressure, magnetism,
motion, etc.) [8] and convert it into electrical
signals. They can collect different types of information
such as weather, temperature, etc.
o Smart meter: an electronic device that records
consumption of electric energy and communicates
that information to the utility for monitoring and
billing [9].
o Substation: a component of an
electrical generation, transmission and distribution
system. Substations transform voltage from high to low, or
the reverse, and are connected to SCADA for remote
monitoring.
o Data servers for external data (weather, events, ...).
• Second level: the communication support and network
layer, responsible for transporting and circulating data
in the smart grid.
• Third level: the most important level, because it is
responsible for the various data processing tasks: cleaning,
storage, management and analysis. Together, these operations
transform data into actionable insights for decision
making.
• Fourth level: represents the smart grid applications. In the
proposed architecture, the planned applications are
consumption prediction, monitoring and production
forecasting.
B. Data management and analytics
The data management system supports the following tasks:
storing, modeling and processing different data types;
training forecasting models, which requires retrieving data and
designing features; retrieving data and scoring forecasting
models at runtime; and providing interfaces to client applications (e.g.
consumption prediction, monitoring, production prediction...).
Figure 1 shows that collected data is first filtered and
cleaned to remove useless information. Then, it is sent to a
NoSQL [10] database for storage (flow 2). NoSQL (Not Only
SQL) databases are one of the big data technologies for storing large
amounts of data, where records need not share the same
structure, unlike in traditional relational databases.
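The cleaning step and the schema-less storage described above can be sketched as follows. This is a minimal illustration under stated assumptions: the cleaning rule (drop records without a source or timestamp, strip empty fields) and the in-memory `DocumentStore` class are hypothetical stand-ins, not the filtering rules or the NoSQL database used in our implementation.

```python
def clean(records):
    """Drop records that cannot be attributed, and strip empty fields."""
    cleaned = []
    for rec in records:
        if rec.get("source") is None or rec.get("timestamp") is None:
            continue  # no device or no time: the record is useless
        cleaned.append({k: v for k, v in rec.items() if v is not None})
    return cleaned


class DocumentStore:
    """In-memory stand-in for a NoSQL collection of free-form records."""

    def __init__(self):
        self._docs = []

    def insert_many(self, docs):
        self._docs.extend(docs)

    def find(self, **criteria):
        """Return the documents whose fields match all given criteria."""
        return [d for d in self._docs
                if all(d.get(k) == v for k, v in criteria.items())]


raw = [
    {"source": "meter-17", "timestamp": "2016-03-01T10:00", "kwh": 0.42, "note": None},
    {"source": "weather-2", "timestamp": "2016-03-01T10:00", "temp_c": 14.5, "wind_ms": 3.1},
    {"source": None, "timestamp": "2016-03-01T10:00", "kwh": 0.10},
]
store = DocumentStore()
store.insert_many(clean(raw))
```

Note that the meter record and the weather record are stored side by side although they do not share the same fields, which is the property the text attributes to NoSQL storage.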
The NoSQL database contains the different types of data collected
by the intelligent devices of the collection level of the
architecture, such as consumption data, production data,
weather data, events, client data, meter data and billing data.
This data is distributed over multiple nodes in an intelligent manner
so that it can be accessed rapidly. It is also replicated in order to
ensure its availability.
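One common way to distribute and replicate records over nodes is hash-based placement, sketched below. This is an assumption chosen for illustration; the text does not specify which placement strategy the NoSQL database uses, and the node names and the `replica_nodes` function are hypothetical.

```python
import hashlib


def replica_nodes(key, nodes, replication_factor=2):
    """Return the nodes holding `key`: a hash-chosen primary, then its successors."""
    if not 1 <= replication_factor <= len(nodes):
        raise ValueError("replication_factor must be between 1 and len(nodes)")
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    start = int(digest, 16) % len(nodes)
    return [nodes[(start + i) % len(nodes)] for i in range(replication_factor)]


nodes = ["node-a", "node-b", "node-c", "node-d"]
placement = {
    meter: replica_nodes(meter, nodes)
    for meter in ("meter-1", "meter-2", "meter-3")
}
```

Because the placement depends only on the key, any node can locate a record without a central directory, and each record survives the loss of one node when the replication factor is 2.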
The modeling process (flow 3) is responsible for feature
selection, which picks the relevant features, and then for creating
models that are stored in the model storage (flow 4) and
used by smart grid applications (flow 5).
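The modeling flow (feature selection, model training, model storage) can be illustrated with a deliberately simple example. The sketch below assumes correlation-based feature selection and a one-variable least-squares model; these techniques, the threshold and the sample values are illustrative assumptions, not the models used in our solution.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else 0.0


def select_features(features, target, threshold=0.5):
    """Keep features whose absolute correlation with the target reaches the threshold."""
    return [name for name, column in features.items()
            if abs(pearson(column, target)) >= threshold]


def fit_line(xs, ys):
    """Ordinary least squares with one predictor: y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return {"slope": slope, "intercept": my - slope * mx}


# Hypothetical training data: one informative feature and one irrelevant one.
features = {
    "temperature_c": [2.0, 5.0, 9.0, 14.0, 20.0],
    "meter_firmware": [3.0, 1.0, 3.0, 1.0, 2.0],
}
consumption_kwh = [9.0, 7.5, 5.5, 3.0, 0.0]

selected = select_features(features, consumption_kwh)        # flow 3
model_storage = {name: fit_line(features[name], consumption_kwh)
                 for name in selected}                       # flow 4
```

Applications would then read the fitted parameters from `model_storage` (flow 5) rather than retraining on each request.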
Flow number 2' represents stream data and concerns online
analytics. In this case, data is communicated to the applications
directly after cleaning and filtering. For example, the
anomaly tracking application uses this type of data combined
with the appropriate model from the model storage.
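The stream path can be sketched as a generator that compares each incoming reading with the value expected by a stored model. The linear model, the tolerance and the field names below are hypothetical; they only illustrate how stream data and a model from the model storage might be combined for anomaly tracking.

```python
def track_anomalies(stream, model, tolerance_kwh=1.0):
    """Yield readings whose consumption deviates from the model's expectation."""
    for reading in stream:
        expected = model["slope"] * reading["temperature_c"] + model["intercept"]
        if abs(reading["kwh"] - expected) > tolerance_kwh:
            yield {**reading, "expected_kwh": expected}


# Hypothetical stored model: consumption as a linear function of temperature.
model_storage = {"consumption_vs_temperature": {"slope": -0.5, "intercept": 10.0}}

stream = iter([
    {"meter": "m1", "temperature_c": 4.0, "kwh": 8.2},
    {"meter": "m2", "temperature_c": 10.0, "kwh": 9.1},
    {"meter": "m3", "temperature_c": 16.0, "kwh": 2.4},
])
anomalies = list(track_anomalies(stream, model_storage["consumption_vs_temperature"]))
```

Since the function is a generator, readings are examined one at a time as they arrive and nothing needs to be stored first, which matches the direct path of flow 2'.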
Applications such as consumption forecasting use request
processing: in this case, it is the application that
submits requests to the request processing component, which
relies on the stored models (flows number 6, 6', 7, 7'). All this processing is done
independently of flow number 2'.
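The request processing component of Figure 1 consists of a request analyzer, a request optimizer and a data access module. The sketch below shows one possible division of work between them; the responsibilities assigned to each stage, the request format and the sample data are assumptions made for illustration only.

```python
def analyze(request):
    """Request analyzer: validate the request and fill in defaults."""
    if "model" not in request or "meter" not in request:
        raise ValueError("a request needs 'model' and 'meter'")
    return {"hours": 24, **request}


def optimize(request):
    """Request optimizer: restrict data access to the fields the model needs."""
    return {**request, "fields": ("hour", "temperature_c")}


def access(request, storage):
    """Data access: fetch the matching rows, projected on the selected fields."""
    rows = [r for r in storage if r["meter"] == request["meter"]]
    rows = rows[: request["hours"]]
    return [{f: r[f] for f in request["fields"]} for r in rows]


def process(request, storage, models):
    """Answer an application request by scoring a stored model on stored data."""
    plan = optimize(analyze(request))
    model = models[plan["model"]]
    return [
        (row["hour"], model["slope"] * row["temperature_c"] + model["intercept"])
        for row in access(plan, storage)
    ]


# Hypothetical stored data and model.
storage = [
    {"meter": "m1", "hour": 0, "temperature_c": 4.0, "kwh": 8.1},
    {"meter": "m1", "hour": 1, "temperature_c": 6.0, "kwh": 7.0},
    {"meter": "m2", "hour": 0, "temperature_c": 4.0, "kwh": 3.3},
]
models = {"consumption": {"slope": -0.5, "intercept": 10.0}}

forecast = process({"model": "consumption", "meter": "m1"}, storage, models)
```

The application only states what it wants (a model and a meter); how the data is fetched is decided inside the request processing component.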
The analytical processes presented below, applied to
heterogeneous kinds of data, serve not only to create models
that will be used by smart grid applications, but also to create
dashboards, determine key performance indicators of the grid and
forecast demand response behavior.
Our solution is based on a big data stack, which enables the
processing of large data sets using the parallel computing paradigm.
We use this potential for query processing, but also for data
preparation such as cleaning, for data preprocessing in feature
selection and for data training in the modeling process.
This allows us to provide a low response time, which is
necessary for some applications such as energy prediction and
system monitoring.
[Figure 1: architecture diagram. Application level: energy prediction, monitoring, anomaly tracking, energy consumption tracking. Data management level: cleaning and filtering; NoSQL storage (production data, meter data, consumption data, client data); modeling (feature selection, model training); model storage; request processing (request analyzer, request optimizer, data access). Communication level: wide area network.]
Figure 1. Proposed Solution Architecture
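The parallel computing paradigm on which the big data stack relies can be illustrated by a map and reduce over data partitions. The sketch below uses a thread pool on a single machine purely to show the structure; an actual big data stack would distribute the partitions over cluster nodes. The partition contents and function names are hypothetical.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def map_partition(partition):
    """Map step: total consumption per meter within one partition."""
    totals = Counter()
    for meter, kwh in partition:
        totals[meter] += kwh
    return totals


def reduce_totals(partials):
    """Reduce step: merge the per-partition totals into global totals."""
    totals = Counter()
    for partial in partials:
        totals.update(partial)  # adds values for meters seen in several partitions
    return dict(totals)


# Hypothetical partitions of (meter, kWh) readings.
partitions = [
    [("m1", 1.0), ("m2", 2.0)],
    [("m1", 0.5), ("m3", 4.0)],
    [("m2", 1.5)],
]

with ThreadPoolExecutor(max_workers=3) as pool:
    totals = reduce_totals(pool.map(map_partition, partitions))
```

Each partition is processed independently, so adding workers or nodes shortens the map step without changing the code of either function.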
CONCLUSION
We have proposed a global functional smart grid
architecture and detailed its different levels and components.
We focused on the two higher data levels, responsible for data
management and analytics using big data infrastructure. The
system we are working on can be used as a service for various
high-level applications. We implemented the filtering process
using Flume [11] and Sqoop [12], which are big data tools. We
are currently working on the NoSQL storage. Our future work focuses
on energy prediction as a use case for our smart grid data
management.
REFERENCES
[1] T. Popovic, M. Kezunovic, B. Krstajic, "Smart grid data analytics
for digital protective relay event", June 2013.
[2] F. Fusco, V. Fischer, V. Lonij, P. Pompey, J. Fiot, B. Chen, Y.
Gkoufas, M. Sin, "Data Management System for Energy Analytics
and its Application to Forecasting", 2016.
[3] Marie-Luce PICARD, EDF R&D, "Données massives pour les smartgrids".
[4] ACCENTURE, "Achieving High Performance in Smart Grid Data
Management".
[5] ACCENTURE, "Unlocking the value of analytics", 2014.
[6] Itron White Paper, "Active Smart Grid Analytics™ Maximizing
Your Smart Grid Invest".
[7] H. A. Mahmood, M. Aamir, M. Irfan, "Design and Implementation of
AMR Smart Grid System", IEEE Electrical Power & Energy
Conference, 2008.
[8] P S.Clara, "Sensor Devices and Sensor Network Applications for the
Smart Grid/Smart Cities", CA, USA, March 2012.
[9] S. Shekara Sreenadh Reddy Depuru, L. Wang, V. Devabhaktuni,
"Smart meters for power grid: Challenges, issues, advantages and
status", February 2011.
[10] R. Kumar, B. Bhushan Parashar, S. Gupta, Y. Sharma, N. Gupta,
"Apache Hadoop, NoSQL and NewSQL Solutions of Big Data",
2014.
[11] Flume 1.6.0 User Guide, Apache Software Foundation.
https://flume.apache.org/FlumeUserGuide.html.
[12] Sqoop User Guide v1.4.4, Apache Software Foundation.
https://sqoop.apache.org/docs/1.4.4/SqoopUserGuide.html.