Household Energy Access Assessment in DRC: HEDERA & APIDE

The objective of this project is to analyze access to energy (electricity and cooking solutions) in rural areas of the Democratic Republic of the Congo.

The study has been implemented by APIDE, a local NGO working in the regions of Kitutu, Kaituga, and Mwenga, in the eastern part of the country, about 150 km from the border with Burundi and Rwanda.

Data collection, processing, and visualization were supported by the HEDERA Impact Toolkit software. This digital report was created as a Jupyter Book.

Information on the use, associated costs, and other attributes of access to electricity and cooking solutions was collected from the households APIDE works with in rural and remote areas, using the HEDERA collect mobile app, which is based on the OpenDataKit open-source framework. The HEDERA Impact Toolkit allows institutions to efficiently establish a baseline for the Sustainable Development Goals (SDGs) and to track progress against it, following, for example, the Multi-tier Framework (MTF) for SDG7 established by the World Bank and the Progress out of Energy Poverty Index (PEPI) (N. Realpe, PhD thesis, 2017).


HEDERA provided a mobile application for data collection, as well as digital material for remote training. APIDE field staff were trained in a one-day workshop held by a member of the organization. Over 10 days, more than 220 data points were collected (household rosters, electricity assessments, and cooking solution assessments).

You can also visit the full report of this case study.

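import os
import sys
import folium

# path to the HEDERA Impact Toolkit sources, relative to this notebook
HIT_PATH = '../../../../src/'
institution_id = 7
lang = 'en'
# make the toolkit modules importable
sys.path.insert(0, os.path.normpath(os.path.join(os.path.abspath(''), HIT_PATH)))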
import hedera_types as hedera
import odk_interface as odk

mfi = hedera.mfi(institution_id,setPathBook=True)
data = mfi.read_survey(mfi.odk_data_name)
mfi.HH = odk.households(data)

Collection overview


The map shows the locations of the collected GPS data. Missing GPS data points are stored with coordinates (0, 0) and are excluded from the map.

import matplotlib.pyplot as plt

select = mfi.HH['GPS_Latitude']!=0
HH_with_GPS = mfi.HH[select]

# change plot layout
plt.rcParams.update({'font.size': 20})
#Define initial geolocation
lat_center = HH_with_GPS['GPS_Latitude'].mean() 
lon_center = HH_with_GPS['GPS_Longitude'].mean()
max_var = max(HH_with_GPS['GPS_Latitude'].var(),HH_with_GPS['GPS_Longitude'].var())
zoom_start = 9
if max_var>0.1:
    zoom_start -= 1
if max_var>1:
    zoom_start -= 1
initial_location = [lat_center, lon_center]

# create map
map_osm = folium.Map(initial_location, zoom_start=zoom_start)
colors = {tier: hedera.tier_color(tier) for tier in range(6)}
# one circle marker per household; the popup shows its coordinates and locality
HH_with_GPS.apply(lambda row: folium.CircleMarker(
    location=[row["GPS_Latitude"], row["GPS_Longitude"]],
    radius=10, fill_color="#FF5733",
    popup=(row["GPS_Latitude"], row["GPS_Longitude"], row["locality"])
).add_to(map_osm), axis=1)
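
The map set-up above can be tried without the HEDERA toolkit or folium. The following is a minimal sketch using only the standard library: the records, field names, and the helpers `has_gps` and `pick_zoom` are hypothetical placeholders, not part of the toolkit. Unlike the cell above, which filters on latitude only, this sketch treats a record as missing only when both coordinates are zero.

```python
from statistics import mean, variance

# Hypothetical records; (0, 0) marks a household without a GPS fix.
households = [
    {"locality": "A", "lat": -3.05, "lon": 28.10},
    {"locality": "B", "lat": 0.0, "lon": 0.0},
    {"locality": "C", "lat": -3.20, "lon": 28.30},
    {"locality": "A", "lat": -3.10, "lon": 28.05},
]


def has_gps(record):
    """A record counts as located unless both coordinates are exactly zero."""
    return not (record["lat"] == 0 and record["lon"] == 0)


def pick_zoom(lats, lons, base_zoom=9):
    """Zoom out one level if the larger coordinate variance exceeds 0.1,
    and one more level if it exceeds 1 (same thresholds as the cell above)."""
    max_var = max(variance(lats), variance(lons))
    zoom = base_zoom
    if max_var > 0.1:
        zoom -= 1
    if max_var > 1:
        zoom -= 1
    return zoom


located = [h for h in households if has_gps(h)]
lats = [h["lat"] for h in located]
lons = [h["lon"] for h in located]
center = [mean(lats), mean(lons)]   # initial map location
zoom_start = pick_zoom(lats, lons)  # initial zoom level
```

`statistics.variance` is the sample variance, which matches the default of the pandas `var()` method used in the original cell.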

Data per location

Data have been collected in three different areas in Eastern DRC.

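import matplotlib.pyplot as plt
# this is needed if the surveys do not cover all states/offices
empty = []
for o in mfi.offices:
    select = mfi.HH['locality']==o
    if sum(select)==0:
        # no household surveyed in this office
        empty.append(o)
# assumption: the truncated loop body drops the empty offices (mfi.offices taken to be a list)
for o in empty:
    mfi.offices.remove(o)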
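
The comment in the cell above refers to offices for which no survey was collected. A minimal, standalone sketch of that filtering step is shown here; the office names, the locality list, and the helper `covered_offices` are hypothetical placeholders rather than toolkit names or survey data.

```python
# Hypothetical office list and per-household localities (placeholders, not survey data).
offices = ["North", "Center", "South", "West"]
household_localities = ["North", "South", "North", "Center", "South"]


def covered_offices(all_offices, localities):
    """Return the offices with at least one household, preserving their order."""
    surveyed = set(localities)
    return [office for office in all_offices if office in surveyed]


offices_with_data = covered_offices(offices, household_localities)
```

Building a new list avoids removing items from a list while iterating over it, which is presumably why the original cell first collects the uncovered offices in `empty`.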

Dates of Collection

The following figure shows the number of surveys collected per day.

import numpy as np
S = odk.get_survey_duration(data)
dates = np.unique(np.array(mfi.HH['date']))
ind = np.arange(len(dates))
dates_plot = []
dates_labels = []

mean_e = []
mean_c = []
mean_tot = []

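for d in dates:
    select = mfi.HH['date'] == d
    dates_plot.append(sum(select))
    # get the survey durations on a given date
    surveys = S[select]
    # keep only surveys with a positive duration for each module
    surveysE = surveys[surveys['electricity'] > 0]
    surveysC = surveys[surveys['cooking'] > 0]
    surveysT = surveys[surveys['total'] > 0]
    # assumption: the daily means are appended here (these lines are missing in the source)
    mean_e.append(surveysE['electricity'].mean())
    mean_c.append(surveysC['cooking'].mean())
    mean_tot.append(surveysT['total'].mean())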
import matplotlib.pyplot as plt

# change plot layout
plt.rcParams.update({'font.size': 20})
# surveys per date
fig, ax = plt.subplots(figsize=(10,8))
plt.bar(ind, dates_plot, width=0.95, edgecolor='white')
plt.xticks(ind, dates, rotation=90)
ax.yaxis.grid(color='grey', linestyle='--', linewidth=0.5)
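
The per-day counts behind this bar chart can be reproduced without the toolkit. The sketch below uses only the standard library; the dates are hypothetical placeholders, with one entry per household interview.

```python
from collections import Counter

# Hypothetical survey dates, one entry per household interview.
survey_dates = [
    "2020-01-13", "2020-01-14", "2020-01-13",
    "2020-01-15", "2020-01-14", "2020-01-13",
]

counts = Counter(survey_dates)
dates = sorted(counts)                         # x-axis labels, in chronological order
surveys_per_day = [counts[d] for d in dates]   # bar heights
```

Sorting ISO-formatted date strings gives chronological order, so `dates` and `surveys_per_day` can be passed directly to a bar plot.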

Average Duration

The average survey duration per day ranged from 5 to 27 minutes.

Note: Some interviews only covered the household roster and are therefore much shorter.

import matplotlib.pyplot as plt

# change plot layout
plt.rcParams.update({'font.size': 20})
# survey duration
fig, ax = plt.subplots(figsize=(10,8))
plt.bar(ind, mean_tot, width=0.95, edgecolor='white', color='blue', label='Total')
plt.xticks(ind, dates, rotation=90)
plt.legend(loc='upper center').set_draggable(True)
ax.yaxis.grid(color='grey', linestyle='--', linewidth=0.85) # horizontal grid lines
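
As a rough illustration of how such daily averages can be computed, the sketch below groups survey durations by day and ignores zero durations. It uses only the standard library; the rows, the minute values, and the helper `mean_positive_by_day` are hypothetical, and the unit returned by the toolkit's duration function is not confirmed by the source.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical (date, duration in minutes) pairs; 0.0 marks a survey with no recorded duration.
surveys = [
    ("2020-01-13", 12.0),
    ("2020-01-13", 0.0),
    ("2020-01-13", 18.0),
    ("2020-01-14", 6.0),
    ("2020-01-15", 0.0),
]


def mean_positive_by_day(rows):
    """Average the positive durations per day; days without any are omitted."""
    grouped = defaultdict(list)
    for day, minutes in rows:
        if minutes > 0:
            grouped[day].append(minutes)
    return {day: mean(values) for day, values in sorted(grouped.items())}


daily_mean = mean_positive_by_day(surveys)
```

A day with no positive duration is left out of the result here, whereas a pandas mean over an empty selection would yield NaN for that day.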

Electricity Access Attributes

import matplotlib.pyplot as plt
plt.rcParams.update({'font.size': 20})

Primary Electricity Sources

import matplotlib.pyplot as plt

Household Appliances

import matplotlib.pyplot as plt
plt.rcParams.update({'font.size': 14})
[appliances_per_x,u_id] = odk.compute_appliances_per_x(data,mfi.HH,mfi.offices,'locality')

MTF Electricity Index vs. Primary Source

import matplotlib.pyplot as plt
plt.rcParams.update({'font.size': 18})

Access to Modern Cooking Solutions Attributes

import matplotlib.pyplot as plt
plt.rcParams.update({'font.size': 20})

Primary Cooking Stoves

import matplotlib.pyplot as plt
plt.rcParams.update({'font.size': 18})

MTF Index (Access to Cooking Solutions) vs. Primary Cooking Fuel

import matplotlib.pyplot as plt
plt.rcParams.update({'font.size': 18})