This is the script for Chapter 19 of the book: Gómez-Corona, Carlos & Rodrigues, Heber. (2023). Consumer Research Methods in Food Science. 10.1007/978-1-0716-3000-6.
Cite this chapter as:
Bécue-Bertaut, M., Álvarez-Esteban, R., Canals, JM. (2023). Use of Lexicometry in Sensometrics, an Essential Complement to Holistic Methods an Original Methodology. In: Gómez-Corona, C., Rodrigues, H. (eds) Consumer Research Methods in Food Science. Methods and Protocols in Food Science. Humana, New York, NY. https://doi.org/10.1007/978-1-0716-3000-6_19
4. The database
Download the database
There are two versions of the database, one in csv format (UTF) and one in RData format. First of all, we must download the WinesFrCat data file from one of the following URLs and save it on our computer:
- csv format (UTF). http://www.xplortext.org/Rdata/WinesFrCat.csv
- RData (UTF). http://www.xplortext.org/Rdata/WinesFrCat.RData
It may be convenient to assign a working directory to store the figures using setwd().
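The download step can also be scripted from R. The following is a minimal sketch, not part of the original script: the two URLs are the ones listed above, while fetch_if_missing() is a hypothetical helper name introduced here for illustration. It downloads a file only when it is not already present in the target folder.

```r
# URLs as listed above; the helper name fetch_if_missing() is an assumption.
wine_urls <- c(
  csv   = "http://www.xplortext.org/Rdata/WinesFrCat.csv",
  rdata = "http://www.xplortext.org/Rdata/WinesFrCat.RData"
)

fetch_if_missing <- function(url, dest_dir = getwd()) {
  # Local path: target folder plus the file name taken from the URL
  dest <- file.path(dest_dir, basename(url))
  if (!file.exists(dest)) {
    # mode = "wb" keeps the binary RData file intact on Windows
    download.file(url, destfile = dest, mode = "wb")
  }
  dest
}

# Example call (needs an internet connection):
# local_files <- vapply(wine_urls, fetch_if_missing, character(1))
```

Skipping files that already exist makes it safe to rerun the script without downloading the data again.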
Load the database in R
To load the database from the csv file:
base <- read.csv2("WinesFrCat.csv", sep=";", header=TRUE, dec=",", row.names=1)
Or directly from the web address:
base <- read.csv2('http://www.xplortext.org/Rdata/WinesFrCat.csv', sep=";", header=TRUE, dec=",", row.names=1)
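To see what the read.csv2() arguments do, the sketch below writes a tiny semicolon-separated file and reads it back. The toy file and its contents are invented for illustration and are unrelated to the wine data. The fileEncoding = "UTF-8" argument is an assumption based on the "(UTF)" note above, and stringsAsFactors = FALSE is set explicitly so that the comments stay character strings in older R versions too.

```r
# Toy file (invented content): first column holds the row names,
# text in the second column, decimal commas in the third.
tmp <- tempfile(fileext = ".csv")
writeLines(c(
  "wine;J1;Score",
  "W1;comment one;6,5",
  "W2;comment two;7,25"
), tmp)

toy <- read.csv2(tmp, sep = ";", header = TRUE, dec = ",", row.names = 1,
                 fileEncoding = "UTF-8", stringsAsFactors = FALSE)

rownames(toy)     # first column used as row names
class(toy$J1)     # free text read as character
toy$Score         # decimal commas converted to numeric values
```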
In RData format:
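load("WinesFrCat.RData") # load() restores the saved objects under their own names; assigning its result would store only those names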
Or directly from the web address:
load(url('http://www.xplortext.org/Rdata/WinesFrCat.RData'))
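Because load() returns only the names of the restored objects, it can help to load an RData file into a separate environment first and inspect what it contains. The sketch below shows this on a small file created on the fly. The object base_demo is invented for illustration, and the name of the object stored in WinesFrCat.RData is not assumed here.

```r
# Save a toy object (invented) to a temporary RData file
tmp <- tempfile(fileext = ".RData")
base_demo <- data.frame(x = 1:3)
save(base_demo, file = tmp)

# Load it into a fresh environment so the workspace is left untouched
env <- new.env()
loaded_names <- load(tmp, envir = env)  # character vector of object names
loaded_names

# Retrieve the restored object by name
restored <- get(loaded_names[1], envir = env)
```

The same pattern applied to the downloaded file would show which object names it provides before anything in the workspace is overwritten.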
Database description
Eight Catalan wines are considered, selected according to a full factorial design involving three factors: designation of origin (DO; Priorat or Empordà, both in Catalonia, abbreviated to P and E), dominant grape variety (Grenache or Carignan, abbreviated to G and C), and production year (2005 or 2006, abbreviated to 05 and 06):
rownames(base)
"PG05" "PG06" "EG05" "EG06" "PC05" "PC06" "EC05" "EC06"
The data frame has eight rows (the wines) and 26 columns: 24 free-text descriptions, one per judge, and the two average scores (FrScore and CatScore) that the French and Catalan judges gave to each of the eight wines.
names(base)
"CE1" "CE2" "CE3" "CE4" "FE5" "FE6" "CE7" "CE8" "CE9" "CE10" "CE11" "FE12" "FP1" "FP3" "FP4" "FP5" "FP6" "FP7" "FP8"
"FP9" "FP10" "FP11" "FP12" "FP2" "FrScore" "CatScore"
dim(base)
8 26
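The wine labels returned by rownames(base) encode the three design factors described above. As a minimal sketch, not part of the original script, the labels can be decoded into one column per factor to confirm that the design is a full factorial. The data frame name design and its column names are assumptions introduced here.

```r
# Wine labels as printed by rownames(base)
wines <- c("PG05", "PG06", "EG05", "EG06", "PC05", "PC06", "EC05", "EC06")

# Decode each position of the label into its factor
design <- data.frame(
  wine  = wines,
  DO    = ifelse(substr(wines, 1, 1) == "P", "Priorat", "Empordà"),
  grape = ifelse(substr(wines, 2, 2) == "G", "Grenache", "Carignan"),
  year  = 2000L + as.integer(substr(wines, 3, 4)),
  stringsAsFactors = FALSE
)

# In a full 2 x 2 x 2 factorial design every cell holds exactly one wine
table(design$DO, design$grape, design$year)
```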
Two databases are constructed, one with the descriptions of French judges and the other with the descriptions of Catalan judges.
Building a dataframe with the 15 French judges
baseFr <- base[,c(5,6,12:24)]
str.name.JFr <- names(baseFr)
str.name.JFr
"FE5" "FE6" "FE12" "FP1" "FP3" "FP4" "FP5" "FP6" "FP7" "FP8" "FP9" "FP10" "FP11" "FP12" "FP2"
Another way to build the data frame with the 15 French judges, selecting the columns by name
str.name.JFr <- c("FE5", "FE6", "FE12", "FP1", "FP3", "FP4", "FP5", "FP6", "FP7", "FP8", "FP9", "FP10", "FP11", "FP12", "FP2")
baseFr <- base[, str.name.JFr]
ncol(baseFr) # 15 French judges
15
Building a dataframe with the 9 Catalan judges
baseCat <- base[,c(1:4,7:11)]
str.name.JCat <- names(baseCat)
Another way, selecting the columns by name
str.name.JCat <- c("CE1", "CE2", "CE3", "CE4", "CE7", "CE8", "CE9", "CE10", "CE11")
baseCat <- base[, str.name.JCat]
ncol(baseCat) # 9 Catalan judges
9
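Positional indices such as c(5,6,12:24) stop working if the column order changes. A possible alternative, sketched here under the assumption that the naming convention seen in names(base) holds (French judges start with F, Catalan judges with C, followed by a letter and a number), is to select the judges with a regular expression. The pattern must require trailing digits, because FrScore and CatScore also start with F and C.

```r
# Column names copied from the names(base) output shown earlier
col_names <- c("CE1", "CE2", "CE3", "CE4", "FE5", "FE6", "CE7", "CE8", "CE9",
               "CE10", "CE11", "FE12", "FP1", "FP3", "FP4", "FP5", "FP6",
               "FP7", "FP8", "FP9", "FP10", "FP11", "FP12", "FP2",
               "FrScore", "CatScore")

# French judges: F, then E or P, then digits only (excludes FrScore)
fr_cols <- grep("^F[EP][0-9]+$", col_names, value = TRUE)

# Catalan judges: CE followed by digits only (excludes CatScore)
cat_cols <- grep("^CE[0-9]+$", col_names, value = TRUE)

length(fr_cols)   # number of French judges found
length(cat_cols)  # number of Catalan judges found
```

With the real data, base[, fr_cols] and base[, cat_cols] would then give the two data frames.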
names(baseCat)
"CE1" "CE2" "CE3" "CE4" "CE7" "CE8" "CE9" "CE10" "CE11"
4.1. Some Lexical Features of Both Corpora
4.1.1. Table 1. Excerpt of the free comments
To build the table:
df1 <- data.frame(matrix(NA, nrow = 6, ncol = 1)) # Create empty data frame
Selecting French judge FE5 and comments on wines PG06, EG05 and EG06:
df1[1,1] <- ("--- French comments")
df1[2,1] <- paste0(baseFr[rownames(baseFr)=="PG06", "FE5" ], " (judge FE5 / wine PG06)")
w2 <- baseFr[rownames(baseFr)=="EG05", "FE5" ]
df1[3,1] <- paste0(w2, " (judge FE5 / wines EG05 and EG06)")
Selecting Catalan judges CE1 and CE2 and their comments on wines EG06 and EG05:
df1[4,1] <- ("--- Catalan comments")
w5 <- baseCat[rownames(baseCat)=="EG06", "CE1" ]
df1[5,1] <- paste0(w5, " (judge CE1 / wines EG06 and PG06)")
w6 <- baseCat[rownames(baseCat)=="EG05", "CE2" ]
df1[6,1] <- paste0(w6, " (judge CE2 / wines EG05 and EG06)")
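The cell lookups above compare rownames() with the wine label. Since the wines are the row names, the same cell can also be reached by indexing directly with the row and column names. The sketch below checks this equivalence on a toy data frame whose comments are placeholders, not real judge comments.

```r
# Toy data frame (placeholder comments) with wines as row names
toy <- data.frame(
  FE5 = c("comment A", "comment B"),
  row.names = c("PG06", "EG05"),
  stringsAsFactors = FALSE
)

# Lookup as written in the script: logical comparison on the row names
by_logical <- toy[rownames(toy) == "PG06", "FE5"]

# Shorter equivalent: index by row name and column name
by_name <- toy["PG06", "FE5"]

identical(by_logical, by_name)

# Append the judge and wine label, as done for Table 1
labelled <- paste0(by_name, " (judge FE5 / wine PG06)")
labelled
```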
Building Table 1 (excerpt of the free comments) with the kableExtra package