gt corn table

Making tables with the R package gt

Last updated on May 2, 2020 25 min read agriculture, tables

gt corn table

Making tables with the R package gt

Last updated on May 2, 2020 25 min read agriculture, tables

Overall notes about the R package gt: I like it! I’m still learning the syntax but it seems intuitive and user-friendly. Would recommend.

It’s worth noting at the beginning of this post that the tables are rendered a little differently via blogdown than they would be if they were just knit into an html file. Therefore, please don’t view this page in dark mode…you won’t be able see half the table rows. After sleuthing through the gt github, I believe this problem should be fixed with the next release on CRAN.

Preamble

Making tables in R is something I’ve dabbled lightly in but never really committed to. The story usually goes: I have some data to present in a table…I try making table in R…I successfully make a table but I can’t figure out how to format it exactly to my liking…I get frustrated…I abandon my R table and go to Microsoft word.

I have no beef with the other R packages intended for table making, and I’m sure with more finesse they could give me what I’m looking for. However, I’m excited to try out gt because it is “the grammar of tables” and was designed to mirror the layered style of creating ggplot2 figures.

I used this self-imposed table-making opportunity to play around with gathering data from the USDA’s National Agricultural Statistics Service (NASS). I used their quickstats page. It took me a couple tries to get the sort of data I was looking for (there are a LOT of options to click through). I decided to look at chemical use on corn acres, as well as some other basic corn stats for Iowa. We do grow a lot of corn here (and no, it’s not sweet corn). If you want to see the raw data, you can find it on my github.

(NB: There were gaps where data were available so I picked time points roughly 4-5 years apart)

Tables

First things first, I had to do a little data cleaning to get rid of some unwanted text and columns in the data-frame.

chem_apps <- 
  read_csv("chem_data.csv") %>%
  mutate(param    = str_remove_all(`Data Item`, 
                                    "CORN - APPLICATIONS, MEASURED IN "),
         app_type = str_remove_all(Domain,
                                   "CHEMICAL, "),
         name     = str_extract(`Domain Category`,
                                   "(?<=\\().+(?=\\))"),
         Value    = str_replace_all(Value, "\\(D\\)", "NA"),
         Value    = str_remove_all(Value, "\\,"),
         Value    = as.numeric(Value)) %>%
  select(Year, param, app_type, name, Value)
head(chem_apps)

## # A tibble: 6 x 5
##    Year param app_type  name                             Value
##   <dbl> <chr> <chr>     <chr>                            <dbl>
## 1  2018 LB    HERBICIDE 2,4-D, 2-EHE = 30063                NA
## 2  2018 LB    HERBICIDE 2,4-D, DIMETH. SALT = 30019         NA
## 3  2018 LB    HERBICIDE 2,4-DB, DIMETH. SALT = 30819        NA
## 4  2018 LB    HERBICIDE ACETOCHLOR = 121601            8823000
## 5  2018 LB    HERBICIDE ATRAZINE = 80803               6621000
## 6  2018 LB    HERBICIDE BAPMA SALT OF DICAMBA = 100094      NA

Next step: build a basic table with gt. Here I’ve filtered my data-frame and rearranged it slightly (pivot_wider!!) and then I’ve gone right into using gtby using pipe operators. It’s quite easy to add a title and subtitle, as well as a source note. It’s also easy to use Markdown formatting by putting your title within the md() function. I appreciate that you can format the numbers to have commas so they are easier to read. Since I kept everything in pounds, these numbers are quite large. Iowa uses a lot of herbicide and fertilizer on a yearly basis!

tab_v1 <- 
  chem_apps %>%
# Get data just on total chemical usage
  filter(name == "TOTAL" | name == "NITROGEN" & param == "LB") %>%
# convert into a wider dataframe that is more table-ready  
  pivot_wider(id_cols = c(Year, Value), 
              names_from = app_type, 
              values_from = Value) %>%
# now add in gt 
  gt() %>%
# Title
  tab_header(title    = md("**Chemical Use on Cornfields in Iowa**"), 
             subtitle = "Total pounds of applied herbicide and fertilizer") %>%
# Source note  
  tab_source_note(source_note = "Source: USDA NASS") %>%
# Format numbers to have commas   
  fmt_number(columns = vars(HERBICIDE,FERTILIZER), 
             use_seps = TRUE,
             decimals = 0) %>%
# this should remove row striping but it doesn't work when using blogdown to render scripts  
  opt_row_striping(row_striping = FALSE)
tab_v1

Chemical Use on Cornfields in Iowa
Total pounds of applied herbicide and fertilizer
Year	HERBICIDE	FERTILIZER
2018	32,371,000	1,784,300,000
2014	30,240,000	1,842,200,000
2010	26,195,000	1,806,600,000
2005	24,726,000	1,653,200,000
2000	24,518,000	1,533,000,000
1995	32,957,000	1,364,600,000
1990	NA	1,561,000,000
Source: USDA NASS

Let’s add a footnote because this table doesn’t include all types of fertilizer.

tab_v1 %>%
  tab_footnote(footnote = md("Nitrogen Fertilizer *only*"),
               locations = cells_column_labels("FERTILIZER"))

Chemical Use on Cornfields in Iowa
Total pounds of applied herbicide and fertilizer
Year	HERBICIDE	FERTILIZER¹
2018	32,371,000	1,784,300,000
2014	30,240,000	1,842,200,000
2010	26,195,000	1,806,600,000
2005	24,726,000	1,653,200,000
2000	24,518,000	1,533,000,000
1995	32,957,000	1,364,600,000
1990	NA	1,561,000,000
Source: USDA NASS
¹ Nitrogen Fertilizer only

Let’s just look at the values in this table for a moment: it doesn’t look like total herbicide or fertilizer use has changed all that much from the 1990s to the near present. OK! Perhaps we should examine certain chemicals instead of total chemical use to make this table more interesting. The first part of this chunk is a different subset of the data and the second part of the chunk is where I made it into a table.

chem_subset <- 
  chem_apps %>%
# filtering for specific herbicides and fertilizer  
  filter(str_detect(name, "^DICAMBA")|str_detect(name, "^GLYPHOSATE")|name == "NITROGEN"|name == "PHOSPHATE") %>%
  filter(param == "LB") %>%
# again, making into a wider format  
  pivot_wider(id_cols = Year, 
              names_from = name, 
              values_from = Value) %>%
# getting totals for formulations  
  mutate(DICAMBA    = rowSums(select(., starts_with("DICAMBA")), na.rm = TRUE),
         GLYPHOSATE = rowSums(select(., starts_with("GLYPHOSATE")), na.rm = TRUE)) 

# Table time! 
chem_subset %>%
  select(Year, NITROGEN, PHOSPHATE, DICAMBA, GLYPHOSATE) %>%
  gt(rowname_col = "Year") %>%
  tab_header(title    = md("**Chemical Use on Cornfields in Iowa**"), 
             subtitle = "Total pounds of applied herbicide and fertilizer")%>%
  tab_stubhead(label = "Year") %>%
  tab_spanner(label = "Fertilizer", columns = c("NITROGEN", "PHOSPHATE")) %>%
  tab_spanner(label = "Herbicide", columns = c("DICAMBA", "GLYPHOSATE")) %>%
  fmt_number(columns = everything(), 
             use_seps = TRUE,
             decimals = 0) %>%
  tab_footnote(footnote = "Includes all formulations with the active ingredient",
               locations = cells_column_labels(c("DICAMBA", "GLYPHOSATE"))) %>%
  tab_source_note(source_note = "Source: USDA NASS") %>%
  opt_row_striping(row_striping = FALSE)

Chemical Use on Cornfields in Iowa
Total pounds of applied herbicide and fertilizer
Year	Fertilizer		Herbicide
Year	NITROGEN	PHOSPHATE	DICAMBA¹	GLYPHOSATE¹
2018	1,784,300,000	698,200,000	323,000	9,759,000
2014	1,842,200,000	600,500,000	91,000	10,047,000
2010	1,806,600,000	620,300,000	11,000	9,390,000
2005	1,653,200,000	579,000,000	164,000	2,230,000
2000	1,533,000,000	503,200,000	988,000	479,000
1995	1,364,600,000	534,500,000	1,373,000	158,000
1990	1,561,000,000	609,000,000	833,000	0
Source: USDA NASS
¹ Includes all formulations with the active ingredient

This table adds a couple new elements:

Years are a stubhead, which is sort of equivalent to transitioning a column to a rowname in a table. In order to transition a column to a stubhead you must specify the column in the initial gt() call. In other words, I don’t believe you can make a column into stubhead by adding another layer to your table.
Column spanners ‘Fertilizer’ and ‘Herbicide’ now label multiple column names.

The numbers in this table suggest that both N & P fertilizer use hasn’t changed much since 1990. This checks out with our initial observation. However there are changes in herbicide usage (mostly because I picked herbicides that I know have changed usage over the years..). Glyphosate use went from 0 lbs in 1990 to over 10 million pounds in 2014. This mirrors the release of ‘Round-up Ready’ corn in 1996-ish (glyphosate is the active killing ingredient in Round-Up) and its widespread adoption. Dicamba (which again is merely the active killing ingredient) is not a new herbicide, but it has gotten renewed press and scrutiny since it has come back onto the market due to the proclivity of the herbicide to volatilize, drift, and damage surrounding crop fields. News stories here and here. This table doesn’t show the type of Dicamba use trend I predicted, but, importantly, I lumped all the herbicides together that use Glyphosate or Dicamba as their active ingredient (peep that footnote). There are actually a myriad of formulations that have been developed over the years that use these active ingredients. The logical next step is a table with all the different formulations separated…

chem_subset2 <-   
  chem_apps %>%
  filter(str_detect(name, "^DICAMBA")|str_detect(name, "^GLYPHOSATE")) %>%
  filter(param == "LB") %>%
  pivot_wider(id_cols = Year, 
              names_from = name, 
              values_from = Value) %>%
  janitor::remove_empty(which = "cols") %>%
  replace(., is.na(.), 0)

chem_subset2 %>%
  gt(rowname_col = "Year") %>%
  tab_header(title    = md("**Herbicide Use on Cornfields in Iowa**"), 
             subtitle = "Total pounds of applied herbicide")%>%
  tab_stubhead(label = "Year") %>%
  cols_move(columns = c("DICAMBA, DIMET. SALT = 29802",
                        "DICAMBA, POT. SALT = 129043"),
            after   = "DICAMBA, SODIUM SALT = 29806") %>%
  fmt_number(columns = everything(), 
             use_seps = TRUE,
             decimals = 0) %>%
  tab_spanner(label = md("**DICAMBA**"), columns = c("DICAMBA = 29801",
                                             "DICAMBA, DIGLY. SALT = 128931",
                                             "DICAMBA, SODIUM SALT = 29806", 
                                             "DICAMBA, DIMET. SALT = 29802",  
                                             "DICAMBA, POT. SALT = 129043")) %>%
  tab_spanner(label = md("**GLYPHOSATE**"), columns = c("GLYPHOSATE = 417300",
                                                "GLYPHOSATE DIM. SALT = 103608",
                                                "GLYPHOSATE ISO. SALT = 103601",
                                                "GLYPHOSATE POT. SALT = 103613"))%>%
  tab_source_note(source_note = "Source: USDA NASS") %>%
# adding color to certain cells!  
  tab_style(style = list(
                      cell_fill(color = "darkgreen"),
                      cell_text(color = "white")),
            locations = cells_body(columns = "DICAMBA, DIGLY. SALT = 128931",
                                   rows    = Year == 2018)) %>%
  tab_style(style = list(
                      cell_fill(color = "darkred"),
                      cell_text(color = "white")),
            locations = cells_body(columns = "DICAMBA = 29801",
                                   rows    = Year < 2005 & Year > 1989)) %>%
  opt_row_striping(row_striping = FALSE)

Herbicide Use on Cornfields in Iowa
Total pounds of applied herbicide
Year	DICAMBA					GLYPHOSATE
Year	DICAMBA = 29801	DICAMBA, DIGLY. SALT = 128931	DICAMBA, SODIUM SALT = 29806	DICAMBA, DIMET. SALT = 29802	DICAMBA, POT. SALT = 129043	GLYPHOSATE = 417300	GLYPHOSATE DIM. SALT = 103608	GLYPHOSATE ISO. SALT = 103601	GLYPHOSATE POT. SALT = 103613
2018	0	227,000	96,000	0	0	1,831,000	727,000	2,678,000	4,523,000
2014	0	0	91,000	0	0	1,530,000	1,058,000	3,500,000	3,959,000
2010	0	0	11,000	0	0	519,000	0	8,871,000	0
2005	0	0	18,000	0	146,000	0	0	2,230,000	0
2000	540,000	0	0	23,000	425,000	479,000	0	0	0
1995	1,373,000	0	0	0	0	158,000	0	0	0
1990	833,000	0	0	0	0	0	0	0	0
Source: USDA NASS

Oy! This is getting messy now. I won’t pretend to understand the differences between these herbicide formulations. I did, however, find this article helpful. As expected, one formulation of Dicamba (highlighted in dark red) has been phased out of usage, while another formulation (highlighted in dark green) has been phased in quite recently. The article I just linked confirms that the newer formulations of Dicamba (labeled commercially as “Clarity” or “Xtend”) have heavier salts added to them (“DIGLY. SALT” = diglycolamine salt) so that the herbicide doesn’t volatilize as easily and cause damage by drift. The point being - looking at formulations can be confusing and insightful. It also gave me a good excuse to try out highlighting different cells.

Alright let’s wrap up this post by looking at some basic corn stats:

acre_dat <- 
  read_csv("acres_data.csv") %>%
  select(Year, Period, `Data Item`, Value) %>%
  filter(Period == "YEAR" | Period == "MARKETING YEAR") %>%
  pivot_wider(id_cols = Year, names_from = `Data Item`, values_from = Value) %>%
  janitor::clean_names()  # should have done this on the last dataset...

corn_url <- "https://www.netclipart.com/pp/m/7-70514_corn-clipart-single-vegetable-corn-clipart-png.png"

last_tab <- 
  gt(acre_dat, rowname_col = "year") %>%
  cols_label(
    corn_grain_price_received_measured_in_bu = "Price",
    corn_acres_planted = "Acres planted",
    corn_grain_production_measured_in = "Total production",
    corn_grain_yield_measured_in_bu_acre = "Yield (bu/acre)"
  ) %>%
  cols_move_to_end(columns = "corn_grain_price_received_measured_in_bu") %>%
  tab_header(title = html("<strong>Acreage, Yield, and Price of Iowa Corn<strong>",
                          web_image(url = corn_url))) %>%
  tab_source_note(source_note = "Source: USDA NASS") %>%
  tab_stubhead(label = "Year") %>%
  tab_footnote(footnote = "per bushel (equivalent to 64 pints)",
               locations = cells_column_labels("corn_grain_price_received_measured_in_bu")) %>%
  fmt_currency(columns = vars(corn_grain_price_received_measured_in_bu,
                              corn_grain_production_measured_in),
               currency = "USD",
               decimals = 2) %>%
  fmt_number(columns = vars(corn_acres_planted),
             use_seps = TRUE,
             decimals = 0) %>%
  grand_summary_rows(columns = vars(corn_grain_price_received_measured_in_bu,
                                    corn_acres_planted,
                                    corn_grain_production_measured_in,
                                    corn_grain_yield_measured_in_bu_acre),
                     fns = list(Average = ~mean(.))) %>%
  opt_table_outline(style = "solid", color = "darkgreen") %>%
  opt_row_striping(row_striping = FALSE)

last_tab

Acreage, Yield, and Price of Iowa Corn

Year	Acres planted	Total production	Yield (bu/acre)	Price¹
2018	13,200,000	$8,971,410,000.00	196	$3.59
2014	13,700,000	$8,783,054,000.00	178	$3.71
2010	13,400,000	$11,261,498,000.00	165	$5.23
2005	12,800,000	$4,195,250,000.00	173	$1.94
2000	12,300,000	$3,024,000,000.00	144	$1.75
1995	11,900,000	$4,565,760,000.00	123	$3.20
1990	12,800,000	$3,452,904,000.00	126	$2.21
Average	12,871,428.57	6,321,982,285.71	157.86	3.09
Source: USDA NASS
¹ per bushel (equivalent to 64 pints)

New features in this table:

I added a darkgreen border to the whole table
I added a grand summary line at the bottom that takes the average of all columns. I couldn’t figure out how to get some of the columns with dollar signs and some without…?
I added a tiny corn image to the title! Which looked great when I saved the table as png (as you can see from image in this post), but then gets all messed up when rendered via blogdown.

Btw, tables are pretty easy to save: # gtsave(last_tab, filename = "corn_table.png")

Again, let’s just take a quick moment to look at these numbers. Corn acreage has increased since 1990, peaking in 2014. Prices per bushel and total production ($) are highest in 2010 (thanks renewable fuel standard!), so it makes sense we’d see a delayed boom in production acres in 2014. Yield continues to steadily increase.

Lydia English

Interested in ecology, sustainable agriculture, and data curation and visualization.

Making tables with the R package gt

Making tables with the R package gt

Preamble

Tables

Lydia English

Related