Script: Use of Lexicometry in Sensometrics

This is the script for Chapter 19 of the book: Gómez-Corona, Carlos & Rodrigues, Heber. (2023). Consumer Research Methods in Food Science. 10.1007/978-1-0716-3000-6.

Cite this chapter as:

Bécue-Bertaut, M., Álvarez-Esteban, R., Canals, JM. (2023). Use of Lexicometry in Sensometrics, an Essential Complement to Holistic Methods an Original Methodology. In: Gómez-Corona, C., Rodrigues, H. (eds) Consumer Research Methods in Food Science. Methods and Protocols in Food Science. Humana, New York, NY. https://doi.org/10.1007/978-1-0716-3000-6_19

4. The database

Download the database

There are two versions of the database, one in csv format (UTF) and one in RData format. First of all, we must download the WinesFrCat data file and save it on our computer; its web addresses appear in the loading commands below.

It may be convenient to assign a working directory to store the figures using setwd().

Load the data file in R

To load the database from the csv file saved in the working directory:

base <- read.csv2("WinesFrCat.csv", sep=";", header=TRUE, dec=",", row.names=1)

 

Or directly from the web address:

base <- read.csv2('http://www.xplortext.org/Rdata/WinesFrCat.csv', sep=";", header=TRUE, dec=",", row.names=1)
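The arguments passed to read.csv2() can be checked on a small file before reading the real one. The sketch below is only an illustration: it writes a toy file with made-up contents (the column names and comments are not from WinesFrCat.csv), assuming the same layout of semicolon-separated fields, decimal commas, and wine labels in the first column.

```r
# Toy file mimicking the layout assumed for WinesFrCat.csv:
# semicolon-separated fields, decimal commas, wine labels in column 1.
# All contents below are made up for illustration.
tmp <- tempfile(fileext = ".csv")
writeLines(c("wine;J1;score",
             "PG06;fruity, woody;7,5",
             "EG05;spicy;6,25"),
           tmp)

toy <- read.csv2(tmp, sep = ";", header = TRUE, dec = ",", row.names = 1)

rownames(toy)      # wine labels taken from the first column
class(toy$score)   # decimal commas are parsed as numbers
unlink(tmp)        # remove the temporary file
```

Because the field separator is a semicolon, commas inside a free comment do not split the field.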

 

In RData format:

load("WinesFrCat.RData") # load() restores the saved objects itself; its return value is only their names

 

Or directly from the web address:

load(url('http://www.xplortext.org/Rdata/WinesFrCat.RData'))
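It may help to see how load() behaves before relying on it. The following is a minimal sketch on a temporary file with a toy object called toy (a hypothetical name, not the wine data): load() recreates the saved objects under their original names and returns only a character vector of those names.

```r
toy <- data.frame(x = 1:3)            # toy object, not the wine data
tmp <- tempfile(fileext = ".RData")
save(toy, file = tmp)
rm(toy)                               # remove it so that load() has to restore it

restored <- load(tmp)   # recreates `toy` in the workspace
restored                # the names of the restored objects, not their contents
exists("toy")           # the object itself is back under its saved name
unlink(tmp)             # remove the temporary file
```

This is why the result of load() is normally not assigned to the name of the data frame one wants to use.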

 


Database description

Eight Catalan wines are considered, selected according to a full factorial design involving three factors: designation of origin (DO; Priorat or Empordà, both in Catalonia, abbreviated to P and E), dominant grape variety (Grenache or Carignan, abbreviated to G and C), and production year (2005 or 2006, abbreviated to 05 and 06):

rownames(base)
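The wine labels can be related to the factorial design with a short sketch. It assumes that each label concatenates the DO, grape and year abbreviations in that order, as labels such as PG06 and EG05 used later in this script suggest; the order of the rows in the data file may differ from the order produced here.

```r
# Full factorial design: 2 DOs x 2 grape varieties x 2 years = 8 wines.
design <- expand.grid(year  = c("05", "06"),
                      grape = c("G", "C"),
                      DO    = c("P", "E"),
                      stringsAsFactors = FALSE)

# Assumed label layout: DO + grape + year, e.g. "PG06".
labels <- with(design, paste0(DO, grape, year))
labels
length(labels)
```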

The data frame has eight rows (the wines) and 26 columns: 24 textual descriptions, one per judge, and two average scores (one from the French judges and one from the Catalan judges) given to each of the eight wines.

names(base)

dim(base)

Two databases are constructed, one with the descriptions of French judges and the other with the descriptions of Catalan judges.

Building a dataframe with the 15 French judges

baseFr <- base[,c(5,6,12:24)]
str.name.JFr <- names(baseFr)
str.name.JFr

Another way to build the dataframe with the 15 French judges, selecting the columns by name

str.name.JFr <- c("FE5", "FE6", "FE12", "FP1", "FP3", "FP4", "FP5", "FP6", "FP7", "FP8", "FP9", "FP10", "FP11", "FP12", "FP2")
baseFr <- base[, str.name.JFr]
ncol(baseFr) # 15 French judges

Building a dataframe with the 9 Catalan judges

baseCat <- base[,c(1:4,7:11)]
str.name.JCat <- names(baseCat)

 

Another way, selecting the columns by name

str.name.JCat <- c("CE1", "CE2", "CE3", "CE4", "CE7", "CE8", "CE9", "CE10", "CE11")
baseCat <- base[, str.name.JCat]
ncol(baseCat) # 9 Catalan judges

names(baseCat)
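A further possibility, sketched here on toy data, is to select the judges by a pattern in the column names rather than by position or by an explicit list. This assumes that French judge columns start with FE or FP and Catalan ones with CE, as in the names listed above; the data frame toy and the two score column names are hypothetical.

```r
# Toy stand-in for `base`: a few judge columns plus two score columns.
# The score column names are hypothetical.
toy <- data.frame(CE1 = "a", CE2 = "b", FE5 = "c", FP1 = "d",
                  ScoreFr = 7, ScoreCat = 6,
                  stringsAsFactors = FALSE)

# Column names matching the assumed naming convention of each panel.
fr.cols  <- grep("^F[EP][0-9]+$", names(toy), value = TRUE)
cat.cols <- grep("^CE[0-9]+$",    names(toy), value = TRUE)

toyFr  <- toy[, fr.cols,  drop = FALSE]
toyCat <- toy[, cat.cols, drop = FALSE]
names(toyFr)
names(toyCat)
```

Selecting by pattern keeps working if the column order changes, but it depends entirely on the naming convention holding for every judge.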

4.1. Some Lexical Features of Both Corpora

4.1.1. Table 1. Excerpt of the free comments

To build the table:

df1 <- data.frame(matrix(NA, nrow = 6, ncol = 1)) # Create empty data frame

 

Selecting French judge FE5 and comments on wines PG06, EG05 and EG06:

df1[1,1] <- ("--- French comments")
df1[2,1] <- paste0(baseFr[rownames(baseFr)=="PG06", "FE5" ], " (judge FE5 / wine PG06)")
w2 <- baseFr[rownames(baseFr)=="EG05", "FE5" ]
df1[3,1] <- paste0(w2, " (judge FE5 / wines EG05 and EG06)")

 

Selecting Catalan judges CE1 and CE2 and their comments on wines EG06, PG06 and EG05:

df1[4,1] <- ("--- Catalan comments")
w5 <- baseCat[rownames(baseCat)=="EG06", "CE1" ]
df1[5,1] <- paste0(w5, " (judge CE1 / wines EG06 and PG06)")
w6 <- baseCat[rownames(baseCat)=="EG05", "CE2" ]
df1[6,1] <- paste0(w6, " (judge CE2 / wines EG05 and EG06)")
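The lookup-and-label pattern used for each row of the table could also be wrapped in a small helper. The sketch below is only one possible way of doing so: the function excerpt() is hypothetical, and the toy data frame holds made-up comments rather than the judges' real ones.

```r
# Hypothetical helper: fetch one judge's comment on one wine and append a label.
excerpt <- function(data, wine, judge, label) {
  paste0(data[wine, judge], " (judge ", judge, " / ", label, ")")
}

# Toy data with made-up comments, indexed by wine (rows) and judge (columns).
toy <- data.frame(FE5 = c("comment A", "comment B"),
                  row.names = c("PG06", "EG05"),
                  stringsAsFactors = FALSE)

excerpt(toy, "PG06", "FE5", "wine PG06")
```

Indexing by row name, as in data[wine, judge], gives the same result as the rownames(...) == "..." comparison used above.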

 

Building Table 1 (Excerpt of the free comments) with the kableExtra package:

library(magrittr)
library(kableExtra)
kableExtra::kable(df1, col.names = " ", caption = "<left><strong>Table 1. Excerpt of the free comments</strong></left>") %>%
  kable_classic(full_width = FALSE, html_font = "Cambria") %>%
  row_spec(1:nrow(df1), italic = TRUE) %>% # all rows in italics
  row_spec(c(1,4), bold = TRUE) %>% # the two heading rows in bold
  row_spec(seq(2, nrow(df1), 2), background = "#CCFFFF") # shade every second row