Title: | Detect SDGs and Targets in Text |
---|---|
Description: | Identify 17 Sustainable Development Goals and associated 169 targets in text. |
Authors: | Yingjie Li [aut, cre] |
Maintainer: | Yingjie Li <[email protected]> |
License: | GPL (>= 3) |
Version: | 2.7.3 |
Built: | 2025-02-23 05:17:32 UTC |
Source: | https://github.com/yingjie4science/sdgdetector |
Users Can Add Customized Patterns for Each SDG or Target
add_sdg_pattern(sdg_id, x, operator = "AND", quiet = FALSE)
sdg_id | SDG Goal's ID or Target's ID, in the format 'SDGx_y', e.g., SDG1_1, SDG2_general |
x | A vector of strings |
operator | 'AND' or 'OR'; how the vector of keywords is combined when identifying SDG Goals or Targets. |
quiet | Logical. Suppress info message |
A regex string
terms_new <- c("improve", "farmer", "income")
add_sdg_pattern(sdg_id = 'SDG1_2', x = terms_new, operator = 'AND')
List of Names and ISO Code for Countries
codelist_panel
codelist_panel
A data frame with 28941 rows and 55 columns:
Country name in English
2 & 3 letter ISO country codes
Year
...
https://en.wikipedia.org/wiki/List_of_countries_and_territories_by_land_and_maritime_borders
Datasets of country and region names.
country_region_names
country_region_names: A data frame with 644 rows and 3 variables
Yingjie Li [email protected]
Detect country or region names in text for further mapping.
detect_region(x, col)
x | Data frame or a string |
col | Column name for text to be assessed |
Returns the tool text outputs.
x <- c("This paper explores the method and results from an independent evidence based assessment of Australia's progress towards the SDGs",
       "Last year alone, the United States experienced 14 separate billion-dollar disasters related to climate change")
col <- data.frame(x)
regions <- detect_region(x, col)
New changes:
func_AND_vector(v)
v | a vector of characters |
Compared with the earlier version, we made the following changes:
Instead of combining multiple term lists by OR for one particular target, it is more intuitive and accurate to add each alternative term list to the search term table or database directly.
Added a Look around function to match SDG targets more accurately.
Use AND to Concatenate a Vector of Terms
A character
words <- c('apple', 'bean', 'food')
func_AND_vector(v = words)
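One common way to express "all of these terms must appear" in a single regular expression is to chain lookaheads. The base R sketch below illustrates that idea only; `and_pattern` is a hypothetical helper, and the string it builds is an assumption, not necessarily what `func_AND_vector()` returns.

```r
# Hypothetical helper (not part of SDGdetector): build a regex that matches
# only when every term occurs somewhere in the text, in any order.
and_pattern <- function(terms) {
  # One lookahead per term, e.g. (?=.*(?:apple)), anchored at the start.
  paste0("^", paste0("(?=.*(?:", terms, "))", collapse = ""))
}

pat <- and_pattern(c("apple", "bean", "food"))

# Lookaheads need the PCRE engine, hence perl = TRUE.
all_present  <- grepl(pat, "bean and apple are food", perl = TRUE)
one_missing  <- grepl(pat, "apple is food", perl = TRUE)
```

Here `all_present` is `TRUE` and `one_missing` is `FALSE`, because "bean" is absent from the second sentence.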
Use OR to Concatenate a Vector of Terms
func_OR_vector(v)
v | a vector of characters |
A character
words <- c('apple', 'bean', 'food')
func_OR_vector(v = words)
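Combining terms with OR usually amounts to regex alternation. The following base R sketch shows that idea under the assumption that a plain `|`-joined group is wanted; `or_pattern` is a hypothetical name and the exact string returned by `func_OR_vector()` may differ.

```r
# Hypothetical helper (not part of SDGdetector): join terms with "|" inside
# a non-capturing group so the result can be embedded in a larger pattern.
or_pattern <- function(terms) {
  paste0("(?:", paste(terms, collapse = "|"), ")")
}

pat_or <- or_pattern(c("apple", "bean", "food"))

# A text matches as soon as any one of the terms occurs.
or_hits <- grepl(pat_or, c("a bean stew", "plain water"), perl = TRUE)
```

With these inputs, `or_hits` is `TRUE` for the first text and `FALSE` for the second.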
The Names, ID, and Descriptions of all the 17 SDGs and 169 Targets
list_of_un_goals_targets
list_of_un_goals_targets
A data frame with 169 rows and 3 columns:
The ID of each SDG
The name of each SDG
The name of each Target
The description for each Target
https://unstats.un.org/sdgs/indicators/indicators-list/
Look around to match pattern in a sentence
lookaround_nearby_n(word_ls1, word_ls2, n, exclude = "", third_AND_string = "")
word_ls1 | A string of words joined by "|", which indicates 'OR' |
word_ls2 | A string of words joined by "|", which indicates 'OR' |
n | A number giving how many words to look around |
exclude | A vector of words to be excluded from the match |
third_AND_string | Like word_ls1 or word_ls2, a string of words joined by "|", which indicates 'OR' |
A regex string
con1 <- c('apple', 'bean', 'food')
con2 <- c('big', 'delicious')
lookaround_nearby_n(word_ls1 = con1, word_ls2 = con2, n = 2, exclude = "", third_AND_string = "")
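A "look around" match can be pictured as a proximity regex: a word from one list must appear within n words of a word from the other list, in either order. The base R sketch below illustrates that technique only; `nearby_pattern` is a hypothetical helper, it ignores the `exclude` and `third_AND_string` arguments, and it is not claimed to reproduce the string that `lookaround_nearby_n()` returns.

```r
# Hypothetical helper (not part of SDGdetector): match a word from `words1`
# and a word from `words2` separated by at most `n` intervening words.
# Both inputs are "|"-joined strings, as in the argument descriptions above.
nearby_pattern <- function(words1, words2, n) {
  # One run of non-word characters, then up to n further "word + separator" units.
  gap <- paste0("\\W+(?:\\w+\\W+){0,", n, "}?")
  a <- paste0("\\b(?:", words1, ")\\b")
  b <- paste0("\\b(?:", words2, ")\\b")
  # Allow either order: a ... b, or b ... a.
  paste0("(?:", a, gap, b, "|", b, gap, a, ")")
}

pat_near <- nearby_pattern("apple|bean|food", "big|delicious", n = 2)

# Two words sit between "delicious" and "apple": within the window.
near_hit  <- grepl(pat_near, "a delicious and cheap apple", perl = TRUE)
# Five words sit between them here: outside the window.
near_miss <- grepl(pat_near, "delicious meals rarely ever include any apple", perl = TRUE)
```

Here `near_hit` is `TRUE` and `near_miss` is `FALSE`.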
SDG bar plot
plot_sdg_bar(data, sdg = "sdg", value = "value", quiet = FALSE)
data | Data frame as the input |
sdg | Vector with SDG code to be visualized. |
value | The value, e.g., number of SDGs, to be shown in the bar plot |
quiet | Logical. Suppress info message |
Returns the tool text outputs.
data("sdgstat")
plot_sdg_bar(sdgstat, sdg = "SDG", value = "Value")
SDG map plot
plot_sdg_map(data, sdg = sdg, value = value, country = country, by_sdg = TRUE)
data | Data frame as the input |
sdg | Vector with SDG code to be visualized. |
value | The value, e.g., number of SDGs, to be shown in the thematic map |
country | Countries that are associated with the SDGs. |
by_sdg | Logical (TRUE or FALSE). Whether to map by SDG. |
Returns the tool text outputs.
data("sdgstat")
plot_sdg_map(sdgstat, sdg = "SDG", value = "Value", country = "Country", by_sdg = FALSE)
Color scheme for the 17 SDGs
sdg_color(x)
x | A number, which indicates the SDG ID |
HTML color code of a specified SDG
sdg_color(1)
sdg_color(x = 1:17)
The sdg_icon function provides the specific icon for each SDG
sdg_icon(x, res = 200)
x | Numeric code for each SDG, ranging from 1 to 17 |
res | Resolution of SDG icon. Default: 200 |
sdg_icon(x = 17, res = 300)
List SDG Icons
sdg_icons
sdg_icons: External pointer of class "magick-image"
Database of SDG search terms
Datasets of SDG keys.
data(SDG_keys)
SDG_keys
An object of class data.frame with 557 rows and 3 columns.
SDG_keys: A data frame with 557 rows and 3 variables
The search terms are developed at the “Target” level (SDG Goal/Target/Indicator) to extract SDG-related statements. A search term can capture a "direct mention", such as "SDG 1", or an "indirect mention", in which a statement aligns with the description of certain SDGs or targets. For example, "Our company has embraced CO2 emissions mitigation as a priority within our sustainability strategy" is an indirect mention of "SDG 13.a" ("Implement the commitment... in the context of meaningful mitigation actions and ...").
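The difference between a direct and an indirect mention can be sketched with two small regular expressions in base R. Both patterns below are illustrative assumptions written for this example; they are not taken from `SDG_keys`, and the real search terms are more elaborate.

```r
# Assumed pattern for a direct mention: "SDG" followed by a goal number 1-17.
direct_pat <- "\\bSDG\\s?(1[0-7]|[1-9])\\b"

text <- "We report on SDG 1 and SDG13 this year, but not SDG 18."
direct_hits <- regmatches(text, gregexpr(direct_pat, text, perl = TRUE))[[1]]
# Keep only the digits of each hit to recover the goal numbers.
direct_goals <- as.integer(gsub("\\D", "", direct_hits))

# Assumed pattern for an indirect mention: both word stems must occur
# (a hypothetical term pair loosely inspired by the SDG 13.a example).
indirect_pat <- "(?=.*emission)(?=.*mitigat)"
statement <- "Our company has embraced CO2 emissions mitigation as a priority within our sustainability strategy"
indirect_hit <- grepl(indirect_pat, statement, perl = TRUE)
```

In this sketch `direct_goals` contains 1 and 13 ("SDG 18" is rejected because 18 is not a valid goal number), and `indirect_hit` is `TRUE` even though the statement never names an SDG.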
Yingjie Li [email protected]
data(SDG_keys)
Identify 17 Sustainable Development Goals and associated 169 targets in text.
SDGdetector(x, col, quiet = FALSE)
x | Data frame or a string |
col | Column name for text to be assessed |
quiet | Logical. Suppress info message |
In 2015, leaders worldwide adopted 17 Sustainable Development Goals (SDGs) with 169
targets to be achieved by 2030 (https://sdgs.un.org). The framework of SDGs serves
as a blueprint for shared prosperity for both people and the earth. SDGdetector
identifies both direct and indirect expressions of SDGs and associated targets in
chunks of text. It takes a data frame with a specified column of text to process as
inputs and outputs a data frame with original columns plus matched SDGs and targets.
Data frame with the same columns as the input data frame plus one extra column named "sdgs", which lists the occurrences (or hits) of SDG goals or targets detected in the sentence in each row. Users can further use our function summarize_sdg() to clean the result for visualization.
my_col <- c("our goal is to end poverty globally",
            "this product contributes to slowing down climate change")
my_text <- data.frame(my_col)
SDGdetector(my_text, my_col)
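The input/output shape described above (original columns plus an "sdgs" column of hits) can be sketched in base R without the package. Everything in this block is an assumption made for illustration: `toy_keys` is a two-row stand-in for a search-term table, the IDs and patterns are invented, and `detect_rows` is a hypothetical function, not the implementation of `SDGdetector()`.

```r
# Toy search-term table (hypothetical IDs and patterns, not from SDG_keys).
toy_keys <- data.frame(
  id      = c("SDG1_general", "SDG13_general"),
  pattern = c("(?=.*poverty)(?=.*end)", "climate change"),
  stringsAsFactors = FALSE
)

# Hypothetical function: test every pattern against every row of text and
# store the matching IDs in a new "sdgs" column.
detect_rows <- function(df, col, keys) {
  df$sdgs <- vapply(df[[col]], function(txt) {
    hit <- vapply(keys$pattern,
                  function(p) grepl(p, txt, perl = TRUE, ignore.case = TRUE),
                  logical(1))
    paste(keys$id[hit], collapse = ", ")
  }, character(1), USE.NAMES = FALSE)
  df
}

toy_text <- data.frame(
  my_col = c("our goal is to end poverty globally",
             "this product contributes to slowing down climate change"),
  stringsAsFactors = FALSE
)
toy_result <- detect_rows(toy_text, "my_col", toy_keys)
```

The result keeps `my_col` and adds `sdgs`, with one toy ID per row for these two sentences.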
Datasets of SDG statistics.
sdgstat
sdgstat: A data frame with 62 rows and 4 variables
Yingjie Li [email protected]
Datasets of shapefiles.
shp
shp: A data frame with 241 rows and 6 variables
Yingjie Li [email protected]
Summarize results from SDGdetector at either the Goal level or Target level.
summarize_sdg(data, sum_by = "target", quiet = FALSE)
data | Data frame or a string |
sum_by | The group level to be chosen for data summary. Default parameter is "target", and can also set at "goal" level. |
quiet | Logical. Suppress info message |
Data frame with at least one column named "SDG" or "Target", and one column Freq that represents the total hits.
library(SDGdetector)
df <- data.frame(col = c(
  'our goal is to end poverty globally',
  'this product contributes to slowing down climate change'))
data <- SDGdetector(x = df, col = col)
summarize_sdg(data, sum_by = 'target', quiet = FALSE)
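Summarizing at the target level versus the goal level comes down to counting hits under two different groupings. The base R sketch below shows that counting step on a made-up vector of target IDs; the IDs are invented for the example, and the code is an illustration of the idea rather than the internals of `summarize_sdg()`.

```r
# Made-up hits in the 'SDGx_y' format (hypothetical data).
toy_hits <- c("SDG1_1", "SDG1_2", "SDG13_2", "SDG1_1")

# Target level: count each target ID as it is.
by_target <- as.data.frame(table(Target = toy_hits), stringsAsFactors = FALSE)

# Goal level: drop the "_y" suffix first, then count.
toy_goals <- sub("_.*$", "", toy_hits)
by_goal <- as.data.frame(table(SDG = toy_goals), stringsAsFactors = FALSE)
```

Both results have a `Freq` column holding the total hits, next to a `Target` or `SDG` column respectively.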