Script Use of Lexicometry in Sensometrics

4.1.3. Most frequent words. Tables and plots

French panel

All French words ordered by frequency, obtained in two ways: from the indexW table and from summary().

res.TD.Fr.Before <- TextData(baseFr, var.text=c(1:ncol(baseFr)), stop.word.user=str.Fr.stopworduser, Fmin=1)
res.TD.Fr.Before$indexW

 

summary(res.TD.Fr.Before, ndoc=0, nword=Inf, info=FALSE)

 

Static (ggplot) bar chart of word frequencies before stopword removal:

plot(res.TD.Fr.Before, nword=15, sel="word", col.fill="#CC0000", interact=FALSE,
title="Most frequent French words")

Interactive (plotly) bar chart of word frequencies, with the percentage of each word before and after stopword removal:

 

plot(res.TD.Fr.Before, nword=15, sel="word", col.fill="#CC0000", interact=TRUE, title="Most frequent French words")

 

Translating the 15 most frequent French words.

- Building a copy of res.TD.Fr.Before object and creating a vector (original.Fr) with the 15 most frequent French words.

res.Fr.Trans <- res.TD.Fr.Before
original.Fr <- rownames(res.TD.Fr.Before$indexW[1:15,])
cat(original.Fr)
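
The step above relies on indexW already being ordered by frequency. As a minimal base R sketch of the same idea on a toy table (toy.index, toy.sorted and top2 are hypothetical names, and the counts and column names are invented), the rows can be sorted explicitly before taking the first row names:

```r
# Toy frequency table standing in for an indexW table (invented values)
toy.index <- matrix(c(2, 5, 3, 1, 4, 3), ncol = 2,
                    dimnames = list(c("bois", "fruit", "vin"),
                                    c("Frequency", "N.Documents")))

# Sort the rows by decreasing frequency
toy.sorted <- toy.index[order(toy.index[, "Frequency"], decreasing = TRUE), ]

# Keep the names of the 2 most frequent words
top2 <- rownames(toy.sorted[1:2, ])
cat(top2)
```

When a single row is selected, `[1, ]` drops the matrix to a plain vector and rownames() returns NULL; adding `drop = FALSE` to the subset avoids that.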

 

- Creating a vector translation.Fr with the English translations, in the same order as original.Fr

translation.Fr <- c("boisé (woody)", "fruit (fruit)", "fruité (fruity)", "tanin (tannin)", "puissant (powerful)", "tannique (tannic)", "mûr (mature/ripe)", "équilibré (balanced)", "très (very)", "vin (wine)", "bois (wood)", "vanillé (vanillin)", "animal (animal)","épice (spicy)", "nonboisé (unwooded)")

 

- Creating a data frame with the original words and translation:

df.Fr15Change <- data.frame(original.Fr, translation.Fr)
df.Fr15Change

- Renaming the terms in the French DocTerm object (only the 15 most frequent words)

res.Fr.Trans$DocTerm$dimnames$Terms[match(df.Fr15Change$original.Fr , res.Fr.Trans$DocTerm$dimnames$Terms)] <- df.Fr15Change$translation.Fr

 

- Renaming the rows of indexW, which holds the frequencies (only the 15 most frequent words)

rownames(res.Fr.Trans$indexW)[match(df.Fr15Change$original.Fr , rownames(res.Fr.Trans$indexW))] <- df.Fr15Change$translation.Fr
res.Fr.Trans$indexW[1:15,]
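
The match()-based renaming used in the two steps above can be tried on a small example. This is only a sketch in base R: toy.index, original, translation and pos are hypothetical names, and the counts and column names are invented for illustration.

```r
# Toy frequency table standing in for res.Fr.Trans$indexW (invented values)
toy.index <- matrix(c(5, 3, 2, 4, 3, 1), ncol = 2,
                    dimnames = list(c("fruit", "vin", "bois"),
                                    c("Frequency", "N.Documents")))

# Lookup vectors: original terms and their labelled translations
original    <- c("vin", "fruit")
translation <- c("vin (wine)", "fruit (fruit)")

# match() gives the row position of each original term;
# assigning through that index renames only the matched rows
pos <- match(original, rownames(toy.index))
rownames(toy.index)[pos] <- translation

rownames(toy.index)
```

match() returns NA for any term it cannot find, so checking anyNA(pos) before the assignment is a cheap safeguard against a misspelled word in the lookup vector.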

 

- Another way to check the changes

summary(res.Fr.Trans, ndoc=0, nword=15, info=FALSE)

 

- Building a data frame with the frequencies of the 15 most frequent French words:

df.FrW <- data.frame(res.Fr.Trans$indexW[1:15,])
df.FrW <- data.frame(rownames(df.FrW), df.FrW)
df.FrW

 

- Building the table

row.names(df.FrW) <- NULL
colnames(df.FrW) <- c("Words", "Count", "No.docs")
df.FrW
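
The two table-building steps above (moving the row names into a column, then resetting the row names and relabelling the columns) can be checked on a toy table. A minimal base R sketch, assuming a two-column count table; toy.index and toy.df are hypothetical names and the values are invented:

```r
# Toy stand-in for the first rows of an indexW table (invented counts)
toy.index <- matrix(c(5, 3, 4, 3), ncol = 2,
                    dimnames = list(c("fruit (fruit)", "vin (wine)"),
                                    c("Frequency", "N.Documents")))

# Copy the row names into a regular first column
toy.df <- data.frame(toy.index)
toy.df <- data.frame(rownames(toy.df), toy.df)

# Reset the row names to 1, 2, ... and give the columns readable labels
row.names(toy.df) <- NULL
colnames(toy.df) <- c("Words", "Count", "No.docs")
toy.df
```

Keeping the words as an ordinary column, rather than as row names, lets the table function format that column like any other.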

 

Table 2.a. Most frequent French words

kableExtra::kable(df.FrW,
  caption = "<left><strong>Table 2.a. Most frequent French words</strong></left>") %>%
  column_spec(1, bold = TRUE) %>%
  kable_classic(full_width = FALSE, html_font = "Cambria") %>%
  row_spec(seq(2, nrow(df.FrW), 2), background = "#CCFFFF")