To run this code, download the file “AdaptationElements” available at: https://doi.org/10.7910/DVN/VK3CP9. Then adjust the file paths below to the location where you saved the “AdaptationElements” file.
library(readr)  # read_csv()
library(dplyr)  # %>%, filter(), select(), mutate()
adaptdata <- read_csv("C:/Users/mlolita/Downloads/detailed.csv")
overview <- read_csv("C:/Users/mlolita/Downloads/Overview.csv")
This section classifies climate-related hazards mentioned in the ElementText column of the adaptdata dataset using regex-based keyword matching, aligned with the protocol’s definitions.
Column Creation
A new column HazardType is added to store the hazard category for each entry. This column is positioned after ElementText.
Keyword Mapping
Hazard categories are mapped to regex patterns that represent keywords found in the protocol definitions. These keywords are matched case-insensitively within ElementText.
Tagging
The script loops over all hazard patterns and tags any matching rows accordingly.
Hazard Category | Keywords Used |
---|---|
Extreme temperature | heat wave, heatwave, excessive heat, high temperature, extreme temperature, extreme heat, extremely low, cold waves, cold wave, snow, ice, frost, freeze, severe winter, maximum and minimum, warmer, higher occurrence of hot, glacier, snowfall, evapo, extreme weather, heat stress, hot days, hot nights |
Storm | storm, tropical storm, cyclone, cyclones, Cyclonic activity, typhoon, typhoons, hail, lightning, thunderstorm, heavy rain, windstorm, sand storm, dust storm, tornado, violent rains, torrential, strong winds |
Drought | drought, drought cycle, prolonged droughts, drougths, dry spell, dry days, aridity |
Wildfire | wildfire, wild fire, forest fire, land fire, bush fire, pasture fire, fire |
Landslide | landslide, land slide |
Flood | flood, coastal flood, riverine flood, flash flood, ice jam flood, inundation |
Change in temperature | change in temperature, alteration in average temperature, temperature change, temperature rise, temperature drop, consistent change in temperature, rise in temperature, increase in temperature, increases in temperature, temperature increase, increasing temperature, increased temperature, increase temperatures, higher temperatures, average temperature, annual temperature, annual mean temperature, annual air temperature, minimum temperatures, maximum temperatures, number of warm days, rising temperatures, warming temperatures |
Change in precipitation | change in precipitation, alteration in precipitation, precipitation patterns, shift in timing, shift in amount, shift in intensity, shift in frequency, rainfall, precipitation, distribution of rains, disruption of rains, fluctuation of rains, shorter rain, earlier and ending later |
Salinization | salinization, salt content, saltwater, salt water, salinity, increase in salt content |
Land degradation | land degradation, decline in land quality, decline in land health, pasture degradation, desertification, loss of organic matter, degradation of land, erosion, soil erosion |
Sea level rise | sea level rise, sea level, increase in sea level, coastal flooding, coastal erosion, beach loss, coastline retreat, submersion, water mass |
Sea temperature | sea temperature, sea surface temperature, change in sea temperature, ocean temperature, water temperature, surface temperature, seawater surface |
Ocean acidification | ocean acidification, reduction in pH, acidity, acidification, acidic, coral |
Pest and disease | pest, disease, epidemic, infestation, invasion, insect infestation, vector-borne, biological event, invasive species |
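The column creation, keyword mapping, and tagging steps described above can be sketched in base R. This is a minimal illustration, not the script's actual code: the data frame adaptdata_demo and its rows are made up, the patterns are abbreviated from the table, and joining multiple matches with "; " is an assumption.

```r
# Toy stand-in for adaptdata (hypothetical rows, for illustration only)
adaptdata_demo <- data.frame(
  Element = c("Hazard", "Hazard", "Hazard", "System at risk"),
  ElementText = c(
    "Prolonged droughts and flash floods",
    "More frequent heat waves",
    "Changes in runoff",
    "Crop yields"
  ),
  Country = c("A", "B", "C", "D"),
  stringsAsFactors = FALSE
)

# Keyword mapping: category name -> regex (abbreviated from the table above)
hazard_patterns <- c(
  "Extreme temperature" = "heat ?wave|extreme heat|frost",
  "Drought"             = "drought|dry spell|aridity",
  "Flood"               = "flood|inundation"
)

# Column creation: add an empty HazardType and move it right after ElementText
adaptdata_demo$HazardType <- ""
cols <- setdiff(names(adaptdata_demo), "HazardType")
adaptdata_demo <- adaptdata_demo[, append(cols, "HazardType", after = match("ElementText", cols))]

# Tagging: loop over the patterns and append every category that matches
# (assumption: multiple matches are joined with "; ")
for (hazard in names(hazard_patterns)) {
  hit <- grepl(hazard_patterns[[hazard]], adaptdata_demo$ElementText, ignore.case = TRUE)
  current <- adaptdata_demo$HazardType[hit]
  adaptdata_demo$HazardType[hit] <- ifelse(current == "", hazard, paste(current, hazard, sep = "; "))
}

print(adaptdata_demo[, c("ElementText", "HazardType")])
```

Because patterns like these match substrings, a short keyword such as ice would also match words like services. If the real patterns do not already guard against this, wrapping such keywords in \\b word boundaries is one option.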
# Count number of 'Hazard' elements that received a HazardType
assigned_hazards <- adaptdata %>%
  filter(Element == "Hazard" & HazardType != "") %>%
  nrow()
# Total number of Hazard elements
total_hazards <- adaptdata %>%
  filter(Element == "Hazard") %>%
  nrow()
# Display the result
cat("Hazard rows assigned a category:", assigned_hazards, "out of", total_hazards, "\n")
## Hazard rows assigned a category: 922 out of 1082
cat("**Coverage Rate:**", round(assigned_hazards / total_hazards * 100, 1), "%\n")
## **Coverage Rate:** 85.2 %
# Load DT
library(DT)
# Filter Hazard elements and show them; texts longer than 50 characters are
# truncated, with the full text available as a hover tooltip
hazard_table <- adaptdata %>%
  filter(Element == "Hazard") %>%
  select(Country, Document, ElementText, HazardType) %>%
  mutate(
    ElementText = ifelse(
      nchar(ElementText) > 50,
      paste0(
        # Escape quotes so apostrophes in the text cannot break the title attribute
        "<span title='", htmltools::htmlEscape(ElementText, attribute = TRUE), "'>",
        substr(ElementText, 1, 50), "...</span>"
      ),
      ElementText
    )
  )
# Display the table
datatable(
  hazard_table,
  escape = FALSE, # Allow HTML rendering
  options = list(pageLength = 10, autoWidth = TRUE),
  caption = "📄 Hazard elements and their assigned HazardType"
)
We assign a SystemType to each row of adaptdata where the element is “System at risk”. This classification is based on the official reporting protocol definitions, which describe different climate-sensitive systems affected by hazards (e.g., agriculture, biodiversity, infrastructure).
To do this, we use a dictionary of keyword patterns that match text found in the ElementText column. These keywords are derived from the protocol and expanded where needed to capture variations in language.
Only rows where the element is “System at risk” are considered.
The following table summarizes the system categories used for classification and the corresponding keyword patterns:
System Type | Keywords Used |
---|---|
Crop | crop, cropping systems, crop production, yield, cultivated area, agriculture, agricultural, agricultural production, agricultural pests, pest, irrigated, rain-fed, farming, land use planning, crop loss, agroecological zone, production |
Livestock | livestock, pasture, pastoralist, pastoral area, grazing, animal health, productivity losses, herder, livestock loss |
Fisheries and aquaculture | fish, fisheries, aquaculture, fishing, marine harvest |
Forest | forest, forestry, forest product, tree, non-timber |
Terrestrial | terrestrial ecosystem, terrestrial, drylands, land resource, natural resource, ecosystem structure, ecosystem services, ecological system, desertification, land degradation, soil degradation, soil erosion, environment |
Freshwater | freshwater, wetlands, inland wetlands, water resource, drinking water, potable water, water quality, water availability, water supply, river, water scarcity, water shortage, hydrological cycle, water stress, water table, water source, groundwater, water supplied, dams |
Biodiversity | biodiversity, flora, fauna, species, species extinction, extinction, range of species, ecosystem change |
Coastal | coast, coastal, marine, mangrove, ocean, coral, beach, coastal erosion, sea level rise, blue carbon, coastal ecosystem, coastal zone, eutrophication, algual bloom |
Food and nutrition | food security, food insecure, food insecurity, nutrition, malnutrition, hunger, food safety, food availability, famine, undernutrition, overnutrition, obesity |
Gender and inclusion | gender, women, youth, children, elderly, inclusion, social exclusion, vulnerable group, vulnerable population, minority group, indigenous, small-scale producer, pastoralist, fishing communities, forest-based communities, high-risk regions |
Livelihoods and poverty | livelihood, poverty, income, employment, labor, economic activity, loss of income, loss of livelihood, safety net, insurance, socio-economic development, tourism, subsistence, workforce, economic, tourists, outdoor activities, ski, vacation |
Health | health, mental health, morbidity, mortality, vector-borne disease, water-borne disease, infectious disease, respiratory disease, malaria, epidemic, climate-sensitive disease, heat morbidity, disease, deaths, human lives, pollution, life expectancy |
Infrastructure and services | infrastructure, critical infrastructure, services, critical services, road, bridge, electricity, power supply, energy, water supply, sanitation, hygiene, education, school, building, housing, settlement, evacuation, telecom, transport, waste management, power, hydropower, railways, port, industry, material, industries |
Human security and Peace | migration, displacement, conflict, armed conflict, national security, human security, organized conflict, climate-induced migration, refugee, peace |
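One way to turn keyword lists like those in the table into patterns, and to apply them only to “System at risk” rows, is sketched below. The data frame system_demo is invented for illustration, only three categories with a few keywords each are shown, and the "; " separator for multiple matches is an assumption rather than the script's confirmed behaviour.

```r
# Keyword dictionary (a small excerpt of the table above)
system_keywords <- list(
  "Crop"       = c("crop", "yield", "agriculture"),
  "Freshwater" = c("freshwater", "water resource", "groundwater"),
  "Health"     = c("health", "malaria", "mortality")
)

# Collapse each keyword vector into a single alternation regex
system_patterns <- vapply(system_keywords, paste, character(1), collapse = "|")

# Toy rows (hypothetical)
system_demo <- data.frame(
  Element = c("System at risk", "System at risk", "Hazard", "System at risk"),
  ElementText = c(
    "Crop yield and groundwater",
    "Malaria incidence",
    "Crop-damaging drought",
    "Cultural heritage sites"
  ),
  stringsAsFactors = FALSE
)

system_demo$SystemType <- ""
# Only rows whose Element is "System at risk" are eligible for a SystemType
is_system <- !is.na(system_demo$Element) & system_demo$Element == "System at risk"

for (sys in names(system_patterns)) {
  hit <- is_system & grepl(system_patterns[[sys]], system_demo$ElementText, ignore.case = TRUE)
  current <- system_demo$SystemType[hit]
  system_demo$SystemType[hit] <- ifelse(current == "", sys, paste(current, sys, sep = "; "))
}

print(system_demo[, c("Element", "ElementText", "SystemType")])
```

The third row mentions crops but is a Hazard element, so it stays untagged. The fourth row is eligible but matches none of the excerpted keywords.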
# Compute number of System at risk rows
total_system_rows <- sum(adaptdata$Element == "System at risk", na.rm = TRUE)
# Count how many received a classification
assigned_system_rows <- sum(adaptdata$Element == "System at risk" & adaptdata$SystemType != "", na.rm = TRUE)
# Print basic coverage stats
cat("**System at Risk Coverage:**", assigned_system_rows, "of", total_system_rows,
    "rows were successfully assigned a category.\n")
## **System at Risk Coverage:** 989 of 1110 rows were successfully assigned a category.
cat("**Coverage Rate:**", round(assigned_system_rows / total_system_rows * 100, 1), "%\n")
## **Coverage Rate:** 89.1 %
We classify sectors in two ways, as IPCC sector categories (SectorType) and as GGA sector themes (SectorType_GGA), using the Sector and SystemType columns.
This follows the sector taxonomy defined in the MPGs (Table 1). The tagging logic first applies regular expressions to the text in Sector, with some fallback rules based on SystemType when relevant. The classification is additive and accounts for overlapping sector concepts.
Below is a summary table of sector categories and the associated keywords used.
IPCC Sector Category | Keywords / Match Terms |
---|---|
Food, fiber and other ecosystem products | agri, agro, food, crop, livestock, animal, fish, fisheries, aquacultur, seed, irrigation, value chain, land use, land tenure, land and forestry, agriculture, Agriculture and food security, Agriculture, Climate services, Others, forest and other land uses |
Terrestrial and freshwater ecosystems | forest, environ, ecosystem, biodiversity, natural, ecology, wildlife, REDD, peatland, protected area |
Ocean and coastal ecosystems | ocean, marine, coast, coastal land use, coastal zone, blue carbon, mangrove |
Water, sanitation and hygiene | sanitation, water and sanitation, sewerage, hygiene, water use, water security |
Cities, settlements and key infrastructure | city, cities, urban, settlement, infrastructure, housing, habitat, industr, waste, transport, energy, landfills, mining, mineral resources, telecommunications |
Health, wellbeing and communities | health, well-being, wellbeing, nutrition, culture, territorial communities, local knowledge |
Livelihoods, poverty and sustainable development | social, poverty, people, econom, capacity, education, employment, tourism, rural development, sustainable development, economic and social infrastructure |
Crosscutting | cross-cutting, cross cutting, cross-sectoral, innovation, research, R&D, integration, empower, gender, women, youth, disaster, risk, climate, meteo, warning, governance, legislation, policy, policies, institution, M&E |
Not specified | Not specified |
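The regex-first, fallback-second logic described above might look roughly like the sketch below. The fallback table system_fallback is hypothetical (the actual fallback rules are not shown in this document), the patterns are abbreviated, and the demo rows are invented. The sketch assumes that "additive" means a row can receive several categories and that the fallback applies only when the Sector text yields no match.

```r
# Abbreviated sector patterns (excerpt of the table above)
sector_patterns <- c(
  "Food, fiber and other ecosystem products" = "agri|agro|food|crop|livestock",
  "Water, sanitation and hygiene"            = "sanitation|sewerage|hygiene|water use|water security",
  "Crosscutting"                             = "cross-cutting|gender|disaster|governance"
)

# Hypothetical fallback rules: SystemType value -> IPCC sector category
system_fallback <- c(
  "Crop"   = "Food, fiber and other ecosystem products",
  "Health" = "Health, wellbeing and communities"
)

# Toy rows (hypothetical)
sector_demo <- data.frame(
  Sector     = c("Agriculture and gender", "National priorities", "Water security", NA),
  SystemType = c("", "Crop", "", "Health"),
  stringsAsFactors = FALSE
)
sector_demo$SectorType <- ""

# Step 1: additive regex tagging on the Sector text (NA never matches)
for (sector in names(sector_patterns)) {
  hit <- grepl(sector_patterns[[sector]], sector_demo$Sector, ignore.case = TRUE)
  current <- sector_demo$SectorType[hit]
  sector_demo$SectorType[hit] <- ifelse(current == "", sector, paste(current, sector, sep = "; "))
}

# Step 2: fall back on SystemType only where step 1 found nothing
needs_fallback <- sector_demo$SectorType == "" &
  sector_demo$SystemType %in% names(system_fallback)
sector_demo$SectorType[needs_fallback] <-
  unname(system_fallback[sector_demo$SystemType[needs_fallback]])

print(sector_demo)
```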
GGA Sector Theme | Mapped IPCC Sector Categories |
---|---|
Water and sanitation | Water, sanitation and hygiene |
Food and agriculture | Food, fiber and other ecosystem products |
Health | Health, wellbeing and communities |
Biodiversity and ecosystems | Terrestrial and freshwater ecosystems; Ocean and coastal ecosystems |
Infrastructure and human settlements | Cities, settlements and key infrastructure |
Poverty eradication and livelihoods | Livelihoods, poverty and sustainable development |
Not specified | Not specified |
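The mapping in the table above can be expressed as a lookup from IPCC category to GGA theme. The sketch below is illustrative only: the helper map_to_gga is a made-up name, and it assumes that multi-category values are stored as "; "-separated strings. Crosscutting has no row in the table, so here it maps to nothing, which may be one reason fewer rows carry a GGA theme than an IPCC category.

```r
# Lookup built from the table above: IPCC sector category -> GGA theme
ipcc_to_gga <- c(
  "Water, sanitation and hygiene"                    = "Water and sanitation",
  "Food, fiber and other ecosystem products"         = "Food and agriculture",
  "Health, wellbeing and communities"                = "Health",
  "Terrestrial and freshwater ecosystems"            = "Biodiversity and ecosystems",
  "Ocean and coastal ecosystems"                     = "Biodiversity and ecosystems",
  "Cities, settlements and key infrastructure"       = "Infrastructure and human settlements",
  "Livelihoods, poverty and sustainable development" = "Poverty eradication and livelihoods",
  "Not specified"                                    = "Not specified"
)

# Hypothetical helper: map one "; "-separated SectorType string to GGA themes
map_to_gga <- function(sector_type) {
  parts <- strsplit(sector_type, "; ", fixed = TRUE)[[1]]
  themes <- unique(unname(ipcc_to_gga[parts]))
  # Categories without a GGA counterpart (e.g. Crosscutting) come back as NA
  paste(themes[!is.na(themes)], collapse = "; ")
}

sector_types <- c(
  "Terrestrial and freshwater ecosystems; Ocean and coastal ecosystems",
  "Food, fiber and other ecosystem products; Crosscutting",
  "Crosscutting",
  ""
)
gga_themes <- vapply(sector_types, map_to_gga, character(1), USE.NAMES = FALSE)
print(gga_themes)
```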
library(DT)
# Count rows classified under SectorType
total_sector_rows <- sum(!is.na(adaptdata$Sector))
classified_sector_rows <- sum(adaptdata$SectorType != "")
classified_sector_rows_gga <- sum(adaptdata$SectorType_GGA != "")
cat("**SectorType (IPCC):**", classified_sector_rows, "of", total_sector_rows, "rows tagged\n")
## **SectorType (IPCC):** 4418 of 4456 rows tagged
cat("**SectorType_GGA (GGA):**", classified_sector_rows_gga, "of", total_sector_rows, "rows tagged\n")
## **SectorType_GGA (GGA):** 4158 of 4456 rows tagged
cat("**Coverage Rate:**", round(classified_sector_rows / total_sector_rows * 100, 1), "%\n\n")
## **Coverage Rate:** 99.1 %
# Trim stray whitespace from the join key so country names match exactly
adaptdata$Country <- trimws(adaptdata$Country)
overview$Country <- trimws(overview$Country)
# Merge metadata from 'overview' to 'adaptdata' by Country (all adaptdata rows are kept)
adaptdata <- merge(adaptdata, overview, by = "Country", all.x = TRUE)
# Convert any list columns to character columns
adaptdata[] <- lapply(adaptdata, function(col) {
  if (is.list(col)) {
    sapply(col, function(x) paste(unlist(x), collapse = "; "))
  } else {
    col
  }
})
# Save the processed data to the working directory
write.csv(adaptdata, "AdaptationElementsProcessed.csv")
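A merge by Country like the one above keeps the row count of adaptdata only if each country appears once in overview, and countries missing from overview end up with NA metadata. The following sketch shows two quick checks on invented data. The Region column and all rows are hypothetical.

```r
# Toy tables (hypothetical); note the stray whitespace in the keys
adapt_demo <- data.frame(
  Country = c("Kenya ", "Peru", "Peru", "Fiji"),
  ElementText = c("Drought", "Floods", "Glacier retreat", "Cyclones"),
  stringsAsFactors = FALSE
)
overview_demo <- data.frame(
  Country = c("Kenya", " Peru"),
  Region = c("Africa", "Latin America"),  # hypothetical metadata column
  stringsAsFactors = FALSE
)

adapt_demo$Country <- trimws(adapt_demo$Country)
overview_demo$Country <- trimws(overview_demo$Country)

# Check 1: the join key must be unique in the metadata table,
# otherwise the left join would duplicate rows
stopifnot(anyDuplicated(overview_demo$Country) == 0)

# Check 2: list countries that will receive no metadata
unmatched <- setdiff(adapt_demo$Country, overview_demo$Country)

merged_demo <- merge(adapt_demo, overview_demo, by = "Country", all.x = TRUE)
stopifnot(nrow(merged_demo) == nrow(adapt_demo))

print(unmatched)
print(merged_demo)
```

Note that merge() sorts the result by the key column by default, so row order can differ from the input.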
Hazard Type:
A few additional climate-related keywords were added to improve coverage. For example, terms like “dry days”, “surface air temperature”, and “heat stress” appeared frequently and were added under appropriate hazard categories.
However, we also encountered terms such as “runoff”, “solar radiation”, and “winds” that were not clearly assignable to a single hazard category and remain unclassified for now.
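One way to spot frequent terms among unclassified hazard rows, as candidates for new keywords, is a simple word count over the untagged text. This sketch uses invented rows and a crude length filter in place of a proper stop-word list. It is not necessarily how the terms above were identified.

```r
# Toy rows (hypothetical); an empty HazardType means no category matched
hazard_demo <- data.frame(
  Element = c("Hazard", "Hazard", "Hazard", "Hazard"),
  ElementText = c(
    "Increased runoff",
    "Changes in solar radiation",
    "Runoff variability",
    "Severe drought"
  ),
  HazardType = c("", "", "", "Drought"),
  stringsAsFactors = FALSE
)

# Keep only the hazard texts that received no category
unassigned <- hazard_demo$ElementText[
  hazard_demo$Element == "Hazard" & hazard_demo$HazardType == ""
]

# Split into lowercase words and drop very short ones (crude stop-word filter)
words <- unlist(strsplit(tolower(unassigned), "[^a-z]+"))
words <- words[nchar(words) > 3]

# Most frequent terms first
term_counts <- sort(table(words), decreasing = TRUE)
print(head(term_counts, 5))
```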
System at Risk:
We expanded the keyword list to better capture common themes in the data.
⚠️ Note: Several rows tagged as System at risk likely describe hazards rather than systems: they mention no system and only describe a climate hazard (e.g., drought or storms). These rows may warrant a second look or reclassification.
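Rows of this kind could be flagged automatically by looking for System at risk entries that received no SystemType but do match a hazard pattern. The sketch below assumes an abbreviated hazard regex and invented rows. The column name LikelyHazard is hypothetical.

```r
# Abbreviated hazard pattern (excerpt of the hazard keyword lists)
hazard_regex <- "drought|storm|flood|heat ?wave"

# Toy rows (hypothetical)
risk_demo <- data.frame(
  Element = rep("System at risk", 4),
  ElementText = c(
    "Recurrent droughts and storms",
    "Crop production",
    "Floods affecting road infrastructure",
    "Cultural heritage"
  ),
  SystemType = c("", "Crop", "Infrastructure and services", ""),
  stringsAsFactors = FALSE
)

# Flag rows that name a hazard but matched no system category
risk_demo$LikelyHazard <- risk_demo$Element == "System at risk" &
  risk_demo$SystemType == "" &
  grepl(hazard_regex, risk_demo$ElementText, ignore.case = TRUE)

print(risk_demo[risk_demo$LikelyHazard, c("ElementText", "SystemType")])
```

The third row also mentions a hazard but names a system (roads), so it is not flagged. The fourth row is simply unclassified.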
Sector Tagging (IPCC & GGA):
Sector tagging shows strong overall coverage, largely because it draws on both direct sector mentions and SystemType.
That said, 38 of the 4456 rows with sector information have no IPCC category, either because sector information wasn’t mentioned explicitly in the original source or because it was too ambiguous to map reliably. It’s unclear whether these untagged rows should be a concern, but they could be flagged for manual review depending on the use case.
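If manual review is wanted, untagged rows could be pulled into a review queue that separates the two cases mentioned above. This is a sketch on invented rows. The ReviewReason column and its labels are hypothetical.

```r
# Toy rows (hypothetical); an empty SectorType means no sector was assigned
review_demo <- data.frame(
  Sector = c("Agriculture", NA, "Other national priorities", " "),
  SectorType = c("Food, fiber and other ecosystem products", "", "", ""),
  stringsAsFactors = FALSE
)

# A sector counts as missing when it is NA or blank
no_sector <- is.na(review_demo$Sector) | trimws(review_demo$Sector) == ""

# Tagged rows get NA; untagged rows get the reason they need review
review_demo$ReviewReason <- ifelse(
  review_demo$SectorType != "",
  NA_character_,
  ifelse(no_sector, "no sector stated", "sector text not mapped")
)

review_queue <- review_demo[!is.na(review_demo$ReviewReason), ]
print(review_queue)
```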