Chapter 4.b.
1. script.ENchap4.b.R
script.ENchap4.b.R can be downloaded in UTF-8 format from:
Download the database
Download the data file Aspiration_Int_UK.RData (UTF-8 format) from the Internet and save it to a directory of your choice:
2. Loading Xplortext package
library(Xplortext)
Loading required package: FactoMineR
Loading required package: ggplot2
Loading required package: tm
Loading required package: NLP
Attaching package: 'NLP'
The following object is masked from 'package:ggplot2':
    annotate
3. Loading the database
load("Aspiration_Int_UK_En.RData")
4. Building the LT
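# User-defined stopwords, removed in addition to the tm stoplist
stopwu <- c("anything", "nothing", "else", "can", "think", "s")
# Build the lexical table from the text columns 9 and 10 of base_UK
res.TD <- TextData(base_UK, var.text = c(9, 10), stop.word.tm = TRUE,
                   stop.word.user = stopwu, Dmin = 15)
summary(res.TD, ndoc = 0, nword = 20)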
TextData summary

                       Before    After
Documents             1043.00  1039.00
Occurrences          13917.00  5026.00
Words                 1334.00    78.00
Mean-length             13.34     4.84
NonEmpty.Docs         1040.00  1039.00
NonEmpty.Mean-length    13.38     4.84

Index of the 20 most frequent words
   Word       Frequency  N.Documents
 1 family           705          624
 2 health           612          555
 3 good             303          254
 4 happiness        229          218
 5 money            172          169
 6 life             161          149
 7 job              143          137
 8 happy            137          123
 9 children         131          129
10 work             118          113
11 friends          116          111
12 husband           96           92
13 home              90           89
14 live              84           76
15 peace             79           75
16 wife              76           72
17 living            68           66
18 enough            68           61
19 people            63           55
20 able              56           51
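The two mean-length rows of the summary can be checked by hand. The sketch below assumes that they are simply the number of occurrences divided by the number of documents (all documents, or non-empty documents only); it uses only figures copied from the table above and plain base R, not an Xplortext function.

```r
# Figures copied from the "TextData summary" table above
occurrences <- c(Before = 13917, After = 5026)
documents   <- c(Before = 1043,  After = 1039)
non_empty   <- c(Before = 1040,  After = 1039)

# Assumed definition: occurrences per document, rounded as in the summary
mean_length           <- round(occurrences / documents, 2)
non_empty_mean_length <- round(occurrences / non_empty, 2)

print(rbind(mean_length, non_empty_mean_length))
```

Under that assumption the ratios agree with the Mean-length and NonEmpty.Mean-length rows printed by `summary(res.TD)`.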
5. Correspondence analysis applied to the LT
First CA: neither contextual variables nor segments are taken into account
res.LexCA <- LexCA(res.TD, graph = FALSE, ncp = 10, lmw = 2, lmd = 2)
summary(res.LexCA, ncp = 10, ndoc = 0, nword = 0, nsup = 0, metaWords = TRUE)
Correspondence analysis summary

Eigenvalues
        Variance  % of var.  Cumulative % of var.
dim 1      0.393      2.604                 2.604
dim 2      0.376      2.487                 5.092
dim 3      0.339      2.248                 7.339
dim 4      0.334      2.215                 9.554
dim 5      0.326      2.162                11.716
dim 6      0.311      2.062                13.778
dim 7      0.309      2.043                15.821
dim 8      0.299      1.978                17.799
dim 9      0.291      1.929                19.727
dim 10     0.288      1.906                21.634

Cramer's V  0.443    Inertia  15.101

Words whose contribution is over 2 times the average word contribution
Dimension 1 +    people healthy want like see go happy getting live
Dimension 1 -    health happiness family
Dimension 2 +    peace mind love freedom want people
Dimension 2 -    healthy
Dimension 3 +    time leisure work dog
Dimension 3 -    enough money happiness live comfortably healthy
Dimension 4 +    living peace mind standard wife children
Dimension 4 -    money leisure time enough work job
Dimension 5 +    mind peace leisure living work standard time
Dimension 5 -    love wife music daughter
Dimension 6 +    living standard go church education freedom
Dimension 6 -    son happy daughter see mind
Dimension 7 +    love church friends people happy others
Dimension 7 -    freedom wife want
Dimension 8 +    keeping job house suppose contentment well getting
Dimension 8 -    church happy others
Dimension 9 +    life social others future keeping help
Dimension 9 -    living suppose home standard husband nice content
Dimension 10 +   freedom healthy friends personal want really music just
Dimension 10 -   food people children grandchildren getting keeping
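The eigenvalue table can be related to the total inertia with a few lines of base R. This is a minimal sketch, not Xplortext code: it assumes that the percentages are each eigenvalue divided by the total inertia, and that Cramér's V follows the usual formula sqrt(inertia / min(rows - 1, columns - 1)), with 1039 documents and 78 words taken from the TextData summary. Because the eigenvalues are copied with three decimals, the results can differ from the printed ones in the last digit.

```r
# Values copied from the "Correspondence analysis summary" above
eig     <- c(0.393, 0.376, 0.339, 0.334, 0.326,
             0.311, 0.309, 0.299, 0.291, 0.288)
inertia <- 15.101
n_docs  <- 1039   # documents kept by TextData
n_words <- 78     # words kept by TextData

pct_var   <- 100 * eig / inertia   # share of the total inertia per axis
cum_pct   <- cumsum(pct_var)       # cumulative share
cramers_v <- sqrt(inertia / min(n_docs - 1, n_words - 1))

print(round(cbind(eig, pct_var, cum_pct), 3))
print(round(cramers_v, 3))
```

The low percentages per axis are typical of a documents-by-words table this sparse, which is why the metakeys rather than the percentages are used later to decide how many axes to keep.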
Word representation: only the metakeys
ellipseLexCA(res.LexCA, selWord = "meta 2", selDoc = NULL, col.word = "black",
             cex = 1.2, title = "Meta-key representation")
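The selection rule behind `lmw=2` and `selWord="meta 2"` is stated in the summary output: words whose contribution is over 2 times the average word contribution. The sketch below applies that rule by hand to dimension 3, assuming the average contribution is 100 divided by the number of words (78). The coordinates and contributions of the ten words are copied from the `summary(res.LexCA)` output shown near the end of this page; the variable names are illustrative and not part of Xplortext.

```r
# First ten words with their dimension 3 coordinates and contributions,
# copied from the summary(res.LexCA) output on this page
words <- c("able", "car", "children", "church", "comfortable",
           "comfortably", "content", "contentment", "daughter", "dog")
coord_dim3   <- c(-0.527, 0.516, 0.211, 0.395, 0.108,
                  -2.065, -0.499, -0.518, 0.347, 1.695)
contrib_dim3 <- c(0.913, 0.281, 0.341, 0.146, 0.014,
                  4.249, 0.262, 0.488, 0.148, 2.694)

n_words   <- 78                   # words kept by TextData
lmw       <- 2                    # multiple of the average contribution
threshold <- lmw * 100 / n_words  # contributions sum to 100 per axis

is_meta  <- contrib_dim3 > threshold
positive <- words[is_meta & coord_dim3 > 0]
negative <- words[is_meta & coord_dim3 < 0]

print(list(threshold = threshold, positive = positive, negative = negative))
```

Among these ten words, the rule picks out the ones that also appear in the "Dimension 3 +" and "Dimension 3 -" lists of the summary, which supports the assumed definition of the threshold.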
Repeated segment visualization: process chain
res.TD <- TextData(base_UK, var.text = c(9, 10), stop.word.tm = TRUE,
                   stop.word.user = stopwu, Dmin = 15, segment = TRUE,
                   seg.nfreq = 3, seg.nfreq2 = 200, seg.nfreq3 = 3, graph = FALSE)
summary(res.TD, nword = 20, nseg = 500, ordFreq = FALSE)
TextData summary

                       Before    After
Documents             1043.00  1039.00
Occurrences          13917.00  5026.00
Words                 1334.00    78.00
Mean-length             13.34     4.84
NonEmpty.Docs         1040.00  1039.00
NonEmpty.Mean-length    13.38     4.84

Statistics for the 10 first docs
   DocName Occurrences DistinctWords PctLength Mean Length100 Occurrences DistinctWords PctLength Mean Length100
                before        before    before         before       after         after     after          after
 1       1           3             3      0.02          22.48           3             3      0.06          62.26
 2       2          17            15      0.12         127.41           6             6      0.12         124.51
 3       3           2             2      0.01          14.99           2             2      0.04          41.50
 4       4           4             4      0.03          29.98           4             4      0.08          83.01
 5       5          13            10      0.09          97.43           6             5      0.12         124.51
 6       6          10            10      0.07          74.94           6             6      0.12         124.51
 7       7           2             2      0.01          14.99           2             2      0.04          41.50
 8       8          26            23      0.19         194.86           8             8      0.16         166.02
 9       9           8             7      0.06          59.96           5             4      0.10         103.76
10      10           8             8      0.06          59.96           4             4      0.08          83.01

Index of the first 20 words in alphabetical order
             Frequency  N.Documents
able                56           51
car                 18           18
children           131          129
church              16           15
comfortable         20           20
comfortably         17           17
content             18           16
contentment         31           30
daughter            21           21
dog                 16           15
education           25           24
employment          20           20
enjoy               17           17
enough              68           61
everything          16           16
family             705          624
food                23           22
freedom             38           35
friends            116          111
future              17           17

Number of repeated segments 349

Index of the repeated segments in alphabetical order
Segment Frequency Long
1 a good family 4 3
2 a good home 3 3
3 a good job 6 3
4 a good life 8 3
5 a good standard of living 6 5
6 a great deal 3 3
7 a happy family life 3 4
8 a healthy life 3 3
9 a job and 6 3
10 a nice home 7 3
11 a nice house 3 3
12 able to afford 3 3
13 able to get 7 3
14 able to go 4 3
15 able to have 3 3
16 able to live 9 3
17 able to live comfortably 5 4
18 able to look after 3 4
19 able to walk 3 3
20 all happy and 3 3
21 all i have 3 3
22 all of us 3 3
23 all the family 6 3
24 and enough money to live 3 5
25 and family life 3 3
26 and have a 5 3
27 and having a 3 3
28 and my family 10 3
29 and my family s 4 4
30 and my husband 3 3
31 and their children 3 3
32 are happy and 3 3
33 are very important 4 3
34 as i am 3 3
35 as i can 5 3
36 as it comes 4 3
37 as long as 11 3
38 as long as i 3 4
39 as much as 5 3
40 as we are 4 3
41 at the moment 7 3
42 be able to 26 3
43 be able to live 5 4
44 be able to live comfortably 3 5
45 be happy and 6 3
46 being able to 23 3
47 being able to get 3 4
48 being able to go out 3 5
49 being able to live 4 4
50 being happy and 6 3
51 can not think 11 3
52 can not think of anything 9 5
53 can not think of anything else 7 6
54 children and grandchildren 3 3
55 children and husband 3 3
56 children are happy 4 3
57 children have a good 3 4
58 day to day 5 3
59 do not have 3 3
60 do not know 9 3
61 do not really 4 3
62 do not want 4 3
63 enough money to 34 3
64 enough money to keep 5 4
65 enough money to live 16 4
66 enough money to live comfortably 4 5
67 enough money to live on 9 5
68 enough to eat 5 3
69 family and friends 15 3
70 family health and happiness 3 4
71 family i suppose 4 3
72 family my job 5 3
73 family nothing else 3 3
74 family peace of mind 3 4
75 family s happiness 4 3
76 family s health 8 3
77 family that is 6 3
78 family that is all 4 4
79 family they are 3 3
80 for all the 4 3
81 for me and my family 3 5
82 for my children 3 3
83 for my family 4 3
84 for myself and 3 3
85 for the family 4 3
86 freedom of speech 4 3
87 freedom to do 4 3
88 friends and family 4 3
89 from day to day 4 4
90 get a good 4 3
91 get a good job 3 4
92 get on in 3 3
93 getting a job 4 3
94 getting into debt 3 3
95 getting on with 4 3
96 good family life 5 3
97 good health and 8 3
98 good health for 9 3
99 good health for all 4 4
100 good health happiness 7 3
101 good health that 3 3
102 good job good 3 3
103 good relationship with 3 3
104 good social life 4 3
105 good standard of living 10 4
106 happiness enough money 3 3
107 happiness of my family 4 4
108 happy and content 4 3
109 happy and healthy 4 3
110 happy family life 13 3
111 have a good 10 3
112 have a job 3 3
113 have enough money 5 3
114 have enough money to 3 4
115 have good health 5 3
116 having a good 5 3
117 having a happy 4 3
118 having a job 4 3
119 having enough money 13 3
120 having enough money to 8 4
121 having enough money to live 4 5
122 health and family 6 3
123 health and happiness 18 3
124 health and my 4 3
125 health being able to 3 4
126 health for all 6 3
127 health good health 5 3
128 health i suppose 6 3
129 health my family 3 3
130 health nothing else 5 3
131 health of family 5 3
132 health of my 7 3
133 health of my family 5 4
134 health of the family 4 4
135 health that is 5 3
136 health that is all 3 4
137 home and family 3 3
138 husband and children 5 3
139 husband and family 5 3
140 husband s health 4 3
141 i am a 5 3
142 i am not 3 3
143 i can not 9 3
144 i can not really think of 3 6
145 i can not think 5 4
146 i can not think of anything else 4 7
147 i can think of 5 4
148 i do not 21 3
149 i do not know 6 4
150 i do not really think 3 5
151 i do not think 4 4
152 i have a 3 3
153 i have got 3 3
154 i like to 10 3
155 i think that 4 3
156 i want to 3 3
157 i wish i 3 3
158 i would like 7 3
159 i would like to 6 4
160 if you are 3 3
161 important to me 5 3
162 in a job 3 3
163 in good health 9 3
164 in the family 4 3
165 in the world 11 3
166 is all i 5 3
167 is important to me 3 4
168 is not a 3 3
169 job my family 3 3
170 job nothing else 3 3
171 just to be 4 3
172 just to live 4 3
173 keep in good health 3 4
174 keeping my job 4 3
175 law and order 9 3
176 life good health 3 3
177 life that is 4 3
178 life to be 3 3
179 like to have 4 3
180 like to see 5 3
181 living in a 5 3
182 look after my 3 3
183 looking after my 3 3
184 me and my 5 3
185 me and my family 4 4
186 money do not 3 3
187 money to be able to 4 5
188 money to buy 3 3
189 money to do 4 3
190 money to live 18 3
191 money to live on 10 4
192 more money to 3 3
193 my children and 5 3
194 my children my 6 3
195 my daughter and 3 3
196 my daughters and 3 3
197 my dog my 3 3
198 my family 236 2
199 my family and 12 3
200 my family are 3 3
201 my family health 3 3
202 my family i 7 3
203 my family i suppose 3 4
204 my family my 27 3
205 my family my children 3 4
206 my family my health 5 4
207 my family nothing 3 3
208 my family s 10 3
209 my family s health 6 4
210 my family that is 4 4
211 my family that is all 3 5
212 my family to 3 3
213 my friends my 5 3
214 my grandchildren my 3 3
215 my health and 6 3
216 my health and my family s health 3 7
217 my health i 5 3
218 my health my 4 3
219 my home and 3 3
220 my home my 7 3
221 my home my friends 3 4
222 my house my 4 3
223 my husband and 12 3
224 my husband and children 3 4
225 my husband and family 4 4
226 my husband my 7 3
227 my husband my family 4 4
228 my husband s 6 3
229 my husband s health 3 4
230 my job and 5 3
231 my job money 3 3
232 my job my 8 3
233 my job my friends 3 4
234 my own home 3 3
235 my son and 4 3
236 my wife and 10 3
237 my wife and family 5 4
238 my wife s 4 3
239 my work my 6 3
240 no money worries 4 3
241 not a lot 3 3
242 not having to 3 3
243 not really think 6 3
244 not think of anything 10 4
245 not think of anything else 8 5
246 nothing else i 4 3
247 of my family 21 3
248 of the children 3 3
249 of the family 8 3
250 of the world 4 3
251 out and about 3 3
252 peace in the world 6 4
253 peace of mind 42 3
254 peace of mind and 3 4
255 peace of mind good 4 4
256 people in general 3 3
257 place to live 3 3
258 quality of life 8 3
259 relationship with my 3 3
260 rest of my 3 3
261 s health and 5 3
262 should be happy 3 3
263 so i can 3 3
264 social life family 3 3
265 standard of living 30 3
266 state of the 4 3
267 sufficient money to 5 3
268 that i am 6 3
269 that is about 11 3
270 that is about it 9 4
271 that is all 33 3
272 that is all i 3 4
273 that is important 3 3
274 that is it 12 3
275 that my family 4 3
276 that my husband 3 3
277 that the children 5 3
278 that the family 4 3
279 that they are 4 3
280 that we all 3 3
281 the family and 4 3
282 the family is 4 3
283 the family that 4 3
284 the health of 6 3
285 the health of my family 3 5
286 the quality of 3 3
287 the rest of 4 3
288 the state of the 3 4
289 their health and 4 3
290 their health and happiness 3 4
291 there is a 3 3
292 there is not 4 3
293 they are alright 3 3
294 think of anything 12 3
295 think of anything else 10 4
296 to be able 25 3
297 to be able to 24 4
298 to be able to get 4 5
299 to be able to live 3 5
300 to be content 4 3
301 to be happy 32 3
302 to be happy and 3 4
303 to be happy to have 3 5
304 to be healthy 5 3
305 to carry on 5 3
306 to do what 8 3
307 to do what i 6 4
308 to do what i want 4 5
309 to enjoy life 3 3
310 to get a 6 3
311 to get a good 3 4
312 to get on 4 3
313 to go out 6 3
314 to have a 12 3
315 to have a good 3 4
316 to have friends 3 3
317 to have good 6 3
318 to have good health 4 4
319 to have my 3 3
320 to help others 3 3
321 to keep going 3 3
322 to keep well 3 3
323 to live a 4 3
324 to live comfortably 11 3
325 to live happily 3 3
326 to live in 4 3
327 to live on 15 3
328 to look after 5 3
329 to pay the 3 3
330 to see my 5 3
331 to see the 3 3
332 want to do 4 3
333 we are all 3 3
334 we do not 4 3
335 wealth and happiness 3 3
336 welfare of my 6 3
337 welfare of my family 4 4
338 wellbeing of family 4 3
339 wellbeing of my 3 3
340 what i want 5 3
341 when i want 4 3
342 wife and family 7 3
343 wife s health 6 3
344 with my husband 3 3
345 with the family 7 3
346 would like to 8 3
347 would like to see 4 4
348 you have got 4 3
349 you want to do 3 4
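To make the notion of a repeated segment concrete, here is a toy counter written in base R. It is not the algorithm used by `TextData`: it only counts word sequences of one fixed length and keeps those reaching a minimum frequency, whereas Xplortext handles several lengths at once through its `seg.nfreq` arguments. The function `count_segments` and the three sample answers are hypothetical and introduced only for illustration.

```r
# Toy repeated-segment counter (hypothetical helper, not part of Xplortext)
count_segments <- function(texts, n = 3, min_freq = 2) {
  grams <- unlist(lapply(texts, function(txt) {
    # Lower-case the answer and split it on anything that is not a letter
    words <- strsplit(tolower(txt), "[^a-z]+")[[1]]
    words <- words[nzchar(words)]
    if (length(words) < n) return(character(0))
    starts <- seq_len(length(words) - n + 1)
    vapply(starts,
           function(i) paste(words[i:(i + n - 1)], collapse = " "),
           character(1))
  }))
  freq   <- table(grams)
  counts <- setNames(as.integer(freq), names(freq))
  # Keep only the segments repeated at least min_freq times
  sort(counts[counts >= min_freq], decreasing = TRUE)
}

# Made-up answers, in the style of the survey responses
answers <- c("peace of mind and good health",
             "my family and peace of mind",
             "good health")
segs <- count_segments(answers, n = 3, min_freq = 2)
print(segs)
```

In this toy corpus only one three-word sequence occurs in two answers, so it is the only segment returned; every other sequence appears once and is dropped by `min_freq`.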
Correspondence analysis computing the segments coordinates
Second CA, identical to the first one except that the repeated segments are taken into account
res.LexCA <- LexCA(res.TD, graph = FALSE, segment = TRUE)
Document representation
plot(res.LexCA, selWord = NULL, col.doc = "gray30",
     title = "Document representation", axes = c(1, 2))
Segment representation
Only the second plot is reproduced
The first one is used to select those segments to visualize in the second plot
plot(res.LexCA, selWord = NULL, selDoc = NULL, selSeg = "cos2 .02", cex = 0.8,
     autoLab = "yes", title = "The best represented repeated segments")
plot(res.LexCA, selWord = NULL, selDoc = NULL,
     selSeg = c(16, 46, 66, 70, 88, 91, 107, 138, 229, 252, 253, 301, 316, 335, 338, 340, 349),
     col.seg = "black", autoLab = "yes", cex = 0.9,
     title = "Selected supplementary repeated segments")
In order to take the contextual variables into account, TextData is run again
res.TD <- TextData(base_UK, var.text = c(9, 10), stop.word.tm = TRUE,
                   stop.word.user = stopwu, Dmin = 15,
                   context.quali = c("Gender", "Education", "Age_Educ", "Gender_Educ"),
                   context.quanti = c("Age"), graph = FALSE)
summary(res.TD, ndoc = 0, nword = 20)
TextData summary

                       Before    After
Documents             1043.00  1039.00
Occurrences          13917.00  5026.00
Words                 1334.00    78.00
Mean-length             13.34     4.84
NonEmpty.Docs         1040.00  1039.00
NonEmpty.Mean-length    13.38     4.84

Index of the 20 most frequent words
   Word       Frequency  N.Documents
 1 family           705          624
 2 health           612          555
 3 good             303          254
 4 happiness        229          218
 5 money            172          169
 6 life             161          149
 7 job              143          137
 8 happy            137          123
 9 children         131          129
10 work             118          113
11 friends          116          111
12 husband           96           92
13 home              90           89
14 live              84           76
15 peace             79           75
16 wife              76           72
17 living            68           66
18 enough            68           61
19 people            63           55
20 able              56           51

Summary of the contextual categorical variables
   Gender       Education         Age_Educ       Gender_Educ
 Man  :494   E_Low   :478   >55_Low     :235   M_EduLow :221
 Woman:545   E_Medium:418   31_55_Low   :225   M_EduMed :197
             E_High  :143   <=30_Medium :187   M_EduHigh: 76
                            31_55_Medium:159   W_EduLow :257
                            >55_Medium  : 72   W_EduMed :221
                            <=30_High   : 61   W_EduHigh: 67
                            (Other)     :100

Summary of the contextual quantitative variables
      Age
 Min.   :18.00
 1st Qu.:30.00
 Median :44.00
 Mean   :45.82
 3rd Qu.:61.00
 Max.   :90.00
summary(res.TD$context$quali, maxsum = 20)
   Gender       Education         Age_Educ       Gender_Educ
 Man  :494   E_Low   :478   <=30_Low    : 18   M_EduLow :221
 Woman:545   E_Medium:418   <=30_Medium :187   M_EduMed :197
             E_High  :143   <=30_High   : 61   M_EduHigh: 76
                            31_55_Low   :225   W_EduLow :257
                            31_55_Medium:159   W_EduMed :221
                            31_55_High  : 60   W_EduHigh: 67
                            >55_Low     :235
                            >55_Medium  : 72
                            >55_High    : 22
res.LexCA <- LexCA(res.TD, graph = FALSE, ncp = 10, lmd = 2, lmw = 2)
Education and Age_Education categories representation. Trajectories
plot(res.LexCA, selDoc = NULL, selWord = NULL, quali.sup = c("Age_Educ", "Education"),
     col.quali.sup = "grey50", xlim = c(-0.5, +0.5), ylim = c(-0.5, +0.5),
     title = "Education and Age_Education categories")
# Rows 3:5 of quali.sup$coord: Education (Low, Medium, High)
lines(res.LexCA$quali.sup$coord[3:5, 1], res.LexCA$quali.sup$coord[3:5, 2], lwd = 4, col = "grey40")
# Rows 6:8, 9:11 and 12:14: Age_Educ for the <=30, 31_55 and >55 age groups
lines(res.LexCA$quali.sup$coord[6:8, 1], res.LexCA$quali.sup$coord[6:8, 2], lwd = 2, col = "grey20")
lines(res.LexCA$quali.sup$coord[9:11, 1], res.LexCA$quali.sup$coord[9:11, 2], lwd = 2, col = "grey20")
lines(res.LexCA$quali.sup$coord[12:14, 1], res.LexCA$quali.sup$coord[12:14, 2], lwd = 2, col = "grey20")
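The trajectories above rely on hard-coded row ranges, which silently break if the order or number of contextual variables changes. A possible alternative, sketched here under stated assumptions, is to look the rows up by the prefix of their names. The small `coord` matrix stands in for `res.LexCA$quali.sup$coord`; its six rows and values are copied from the supplementary categories table printed at the end of this page, and `rows_for` is a hypothetical helper, not an Xplortext function.

```r
# Stand-in for res.LexCA$quali.sup$coord (first two dimensions only),
# values copied from the supplementary categories table on this page
coord <- matrix(
  c( 0.049, -0.091,
    -0.051,  0.031,
    -0.018,  0.216,
     0.054, -0.212,
    -0.004, -0.049,
     0.081,  0.169),
  ncol = 2, byrow = TRUE,
  dimnames = list(
    c("Education.E_Low", "Education.E_Medium", "Education.E_High",
      "Age_Educ.<=30_Low", "Age_Educ.<=30_Medium", "Age_Educ.<=30_High"),
    c("Dim 1", "Dim 2")))

# Hypothetical helper: indices of the rows whose name starts with a prefix
rows_for <- function(coord, prefix) which(startsWith(rownames(coord), prefix))

edu_rows   <- rows_for(coord, "Education.")
young_rows <- rows_for(coord, "Age_Educ.<=30_")

# Empty frame, category labels, then one line per trajectory
plot(coord[, 1], coord[, 2], type = "n", xlab = "Dim 1", ylab = "Dim 2",
     xlim = c(-0.5, 0.5), ylim = c(-0.5, 0.5))
text(coord[, 1], coord[, 2], labels = rownames(coord), cex = 0.7, col = "grey50")
lines(coord[edu_rows, 1], coord[edu_rows, 2], lwd = 4, col = "grey40")
lines(coord[young_rows, 1], coord[young_rows, 2], lwd = 2, col = "grey20")
```

The same lookup would apply to the other age groups and to the Gender_Educ categories, provided the row names follow the "Variable.Category" pattern seen in the test-value table.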
Gender and Gender_Education categories representation. Trajectories
plot(res.LexCA, selDoc = NULL, selWord = NULL, quali.sup = c("Gender", "Gender_Educ"),
     col.quali.sup = "grey50", xlim = c(-0.5, +0.5), ylim = c(-0.5, +0.5),
     title = "Gender and Gender_Education categories")
# Rows 15:17: men by education level; rows 18:20: women by education level
lines(res.LexCA$quali.sup$coord[15:17, 1], res.LexCA$quali.sup$coord[15:17, 2], lwd = 2, col = "grey20")
lines(res.LexCA$quali.sup$coord[18:20, 1], res.LexCA$quali.sup$coord[18:20, 2], lwd = 2, col = "grey20")
Test values on the position of the represented categories
round(res.LexCA$quali.sup$v.test[, 1:5], 3)
                        Dim 1   Dim 2   Dim 3   Dim 4   Dim 5
Gender.Man              0.020  -0.556   1.998  -2.142  -0.078
Gender.Woman           -0.020   0.556  -1.998   2.142   0.078
Education.E_Low         3.232  -6.013  -2.041   2.917  -2.299
Education.E_Medium     -2.932   1.765  -0.448  -2.547   2.150
Education.E_High       -0.513   6.167   3.574  -0.602   0.273
Age_Educ.<=30_Low       0.500  -1.948  -0.302  -0.078   3.391
Age_Educ.<=30_Medium   -0.116  -1.557   0.189  -5.511   2.061
Age_Educ.<=30_High      1.426   2.964   2.208  -1.863  -0.065
Age_Educ.31_55_Low     -1.536  -4.441   0.179   1.206  -0.227
Age_Educ.31_55_Medium  -2.474   3.246   0.104   0.546   2.119
Age_Educ.31_55_High    -1.223   4.078   3.266  -1.101  -0.701
Age_Educ.>55_Low        5.187  -2.182  -2.503   2.298  -3.529
Age_Educ.>55_Medium    -1.967   1.006  -1.377   2.599  -2.139
Age_Educ.>55_High      -1.533   3.290  -0.398   3.421   1.935
Gender_Educ.M_EduLow    1.486  -5.428   0.152   1.131  -1.886
Gender_Educ.M_EduMed   -0.520   1.948   0.680  -3.572   0.482
Gender_Educ.M_EduHigh  -1.538   4.601   2.633  -0.607   2.116
Gender_Educ.W_EduLow    2.312  -1.829  -2.471   2.281  -0.874
Gender_Educ.W_EduMed   -2.994   0.282  -1.164   0.300   2.102
Gender_Educ.W_EduHigh   0.849   3.785   2.227  -0.210  -1.760
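A common way to read such test values, assumed here rather than stated on this page, is to treat them as approximately standard normal under the hypothesis that a category lies at the origin of the axis, so that an absolute value above about 1.96 is notable at the 5% level. The sketch below applies that reading to three rows copied from the table above, using base R only.

```r
# Three rows copied from the test-value table above
v_test <- rbind(
  "Gender.Man"       = c( 0.020, -0.556,  1.998, -2.142, -0.078),
  "Education.E_Low"  = c( 3.232, -6.013, -2.041,  2.917, -2.299),
  "Education.E_High" = c(-0.513,  6.167,  3.574, -0.602,  0.273)
)
colnames(v_test) <- paste("Dim", 1:5)

# Assumption: test values behave like standard normal deviates
significant <- abs(v_test) > qnorm(0.975)   # two-sided 5% threshold
p_value     <- 2 * pnorm(-abs(v_test))      # two-sided p-values

print(significant)
print(round(p_value, 3))
```

Read this way, education level stands out on the second axis far more clearly than gender does on any axis, which matches the emphasis the plots place on the education trajectories.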
Answer clustering
Selecting how many axes to keep by reading the metawords/metakeys issued from the former CA
res.LexCA <- LexCA(res.TD, graph = FALSE, ncp = 50, lmd = 2, lmw = 2)
summary(res.LexCA, ndoc = 0, nword = 0, nsup = 0, metaWords = TRUE, ncp = 50)
Correspondence analysis summary

Eigenvalues
        Variance  % of var.  Cumulative % of var.
dim 1      0.393      2.604                 2.604
dim 2      0.376      2.487                 5.092
dim 3      0.339      2.248                 7.339
dim 4      0.334      2.215                 9.554
dim 5      0.326      2.162                11.716
dim 6      0.311      2.062                13.778
dim 7      0.309      2.043                15.821
dim 8      0.299      1.978                17.799
dim 9      0.291      1.929                19.727
dim 10     0.288      1.906                21.634
dim 11     0.282      1.870                23.504
dim 12     0.278      1.840                25.344
dim 13     0.272      1.799                27.143
dim 14     0.269      1.778                28.921
dim 15     0.264      1.749                30.670
dim 16     0.261      1.725                32.396
dim 17     0.256      1.697                34.093
dim 18     0.253      1.676                35.769
dim 19     0.250      1.657                37.426
dim 20     0.248      1.640                39.066
dim 21     0.244      1.617                40.683
dim 22     0.241      1.594                42.277
dim 23     0.239      1.585                43.862
dim 24     0.235      1.558                45.419
dim 25     0.228      1.513                46.932
dim 26     0.228      1.510                48.442
dim 27     0.224      1.483                49.925
dim 28     0.220      1.459                51.384
dim 29     0.216      1.428                52.811
dim 30     0.214      1.420                54.232
dim 31     0.211      1.400                55.631
dim 32     0.207      1.373                57.004
dim 33     0.205      1.360                58.364
dim 34     0.199      1.318                59.682
dim 35     0.196      1.299                60.981
dim 36     0.196      1.295                62.276
dim 37     0.194      1.282                63.558
dim 38     0.190      1.258                64.816
dim 39     0.189      1.251                66.067
dim 40     0.186      1.232                67.299
dim 41     0.184      1.220                68.520
dim 42     0.183      1.211                69.731
dim 43     0.178      1.181                70.912
dim 44     0.176      1.168                72.080
dim 45     0.173      1.147                73.226
dim 46     0.171      1.132                74.358
dim 47     0.167      1.103                75.461
dim 48     0.166      1.099                76.560
dim 49     0.164      1.083                77.644
dim 50     0.161      1.068                78.712

Cramer's V  0.443    Inertia  15.101

Words whose contribution is over 2 times the average word contribution
Dimension 1 +    people healthy want like see go happy getting live
Dimension 1 -    health happiness family
Dimension 2 +    peace mind love freedom want people
Dimension 2 -    healthy
Dimension 3 +    time leisure work dog
Dimension 3 -    enough money happiness live comfortably healthy
Dimension 4 +    living peace mind standard wife children
Dimension 4 -    money leisure time enough work job
Dimension 5 +    mind peace leisure living work standard time
Dimension 5 -    love wife music daughter
Dimension 6 +    living standard go church education freedom
Dimension 6 -    son happy daughter see mind
Dimension 7 +    love church friends people happy others
Dimension 7 -    freedom wife want
Dimension 8 +    keeping job house suppose contentment well getting
Dimension 8 -    church happy others
Dimension 9 +    life social others future keeping help
Dimension 9 -    living suppose home standard husband nice content
Dimension 10 +   freedom healthy friends personal want really music just
Dimension 10 -   food people children grandchildren getting keeping
Dimension 11 +   love healthy security time standard leisure
Dimension 11 -   church suppose going friends
Dimension 12 +   welfare wellbeing wife contentment see people living
Dimension 12 -   husband freedom happy home social want
Dimension 13 +   keep way love friends future able keeping church well
Dimension 13 -   getting social people happy
Dimension 14 +   healthy church keeping employment time getting leisure going
Dimension 14 -   work dog love able social
Dimension 15 +   suppose really important happiness much music
Dimension 15 -   wellbeing church content house work general
Dimension 16 +   keeping standard freedom house content everything
Dimension 16 -   suppose children home dog husband able live
Dimension 17 +   music daughter church social suppose peace love
Dimension 17 -   children freedom keep able
Dimension 18 +   see education like
Dimension 18 -   really suppose wife just happy much
Dimension 19 +   way wellbeing nice important home friends people job
Dimension 19 -   church keep go going security healthy
Dimension 20 +   kids important going mind education much really long
Dimension 20 -   suppose son world
Dimension 21 +   daughter dog get going time others
Dimension 21 -   healthy work friends future
Dimension 22 +   contentment education future content long children see suppose
Dimension 22 -   keep friends healthy dog wife
Dimension 23 +   wellbeing social education suppose general
Dimension 23 -   satisfaction others music happy house personal
Dimension 24 +   daughter husband son work
Dimension 24 -   dog wellbeing music content like employment
Dimension 25 +   content son job money good
Dimension 25 -   healthy live family wellbeing work husband see
Dimension 26 +   dog happiness contentment way world church food nice personal just
Dimension 26 -   welfare music enjoy
Dimension 27 +   music getting son everything go
Dimension 27 -   wife freedom job suppose
Dimension 28 +   job wellbeing world getting things going long
Dimension 28 -   food contentment friends education well keeping
Dimension 29 +   music future son food world keep make
Dimension 29 -   employment getting wife important able friends worries daughter
Dimension 30 +   education kids daughter employment house help others happiness
Dimension 30 -   contentment job good see
Dimension 31 +   nice go holidays wellbeing enjoy wife able
Dimension 31 -   dog welfare employment live long husband
Dimension 32 +   food content house
Dimension 32 -   worries grandchildren money welfare family
Dimension 33 +   house children personal much nice social content
Dimension 33 -   go dog just important kids son
Dimension 34 +   way holidays house security content daughter
Dimension 34 -   employment wellbeing world happy personal
Dimension 35 +   going son keep money enjoy important
Dimension 35 -   able good get house job daughter world personal
Dimension 36 +   going healthy son everything education world car
Dimension 36 -   kids grandchildren music well
Dimension 37 +   worries get suppose way going happy children security everything
Dimension 37 -   like grandchildren personal good
Dimension 38 +   worries everything welfare security want grandchildren suppose others wellbeing
Dimension 38 -   holidays get happy employment
Dimension 39 +   grandchildren world house employment important satisfaction
Dimension 39 -   security really wife getting people future
Dimension 40 +   important content comfortable well much
Dimension 40 -   contentment husband really kids employment
Dimension 41 +   grandchildren going friends music food long like
Dimension 41 -   car see church way son welfare house
Dimension 42 +   enjoy food worries friends dog much
Dimension 42 -   nice just car want keep
Dimension 43 +   everything home contentment music dog
Dimension 43 -   holidays go welfare going just
Dimension 44 +   want others getting car dog friends money contentment
Dimension 44 -   just like home family
Dimension 45 +   security employment enjoy important world
Dimension 45 -   happiness welfare everything
Dimension 46 +   people long get husband contentment important music
Dimension 46 -   welfare enjoy future food see just getting grandchildren
Dimension 47 +   long way food home world holidays
Dimension 47 -   future people grandchildren
Dimension 48 +   car make welfare friends son security
Dimension 48 -   want people worries see
Dimension 49 +   grandchildren enjoy home content employment help food long satisfaction
Dimension 49 -   happy house education world things time
Dimension 50 +   holidays just worries like enjoy children good general
Dimension 50 -   home way happy comfortable go
Six axes have to be kept. The CA has to be run again, keeping only these 6 axes
res.TD <- TextData(base_UK, var.text = c(9, 10), stop.word.tm = TRUE,
                   stop.word.user = stopwu, Dmin = 15,
                   context.quali = c("Gender", "Education", "Age_Educ", "Gender_Educ"),
                   context.quanti = c("Age"), graph = FALSE)
summary(res.TD)
TextData summary

                       Before    After
Documents             1043.00  1039.00
Occurrences          13917.00  5026.00
Words                 1334.00    78.00
Mean-length             13.34     4.84
NonEmpty.Docs         1040.00  1039.00
NonEmpty.Mean-length    13.38     4.84

Statistics for the 10 first docs
   DocName Occurrences DistinctWords PctLength Mean Length100 Occurrences DistinctWords PctLength Mean Length100
                before        before    before         before       after         after     after          after
 1     256          73            65      0.52         547.09          20            19      0.40         415.04
 2     571          49            32      0.35         367.23          17            13      0.34         352.79
 3     122          73            59      0.52         547.09          16            14      0.32         332.03
 4     754          41            28      0.29         307.27          16            12      0.32         332.03
 5     563          31            26      0.22         232.33          14            13      0.28         290.53
 6      43          57            48      0.41         427.18          13            10      0.26         269.78
 7     275          54            43      0.39         404.70          13            11      0.26         269.78
 8     455          62            39      0.45         464.65          13             8      0.26         269.78
 9     519          44            34      0.32         329.75          13            12      0.26         269.78
10     562          34            29      0.24         254.81          13            13      0.26         269.78

Index of the 50 most frequent words
   Word           Frequency  N.Documents
 1 family               705          624
 2 health               612          555
 3 good                 303          254
 4 happiness            229          218
 5 money                172          169
 6 life                 161          149
 7 job                  143          137
 8 happy                137          123
 9 children             131          129
10 work                 118          113
11 friends              116          111
12 husband               96           92
13 home                  90           89
14 live                  84           76
15 peace                 79           75
16 wife                  76           72
17 living                68           66
18 enough                68           61
19 people                63           55
20 able                  56           51
21 get                   54           51
22 just                  52           52
23 keep                  50           39
24 mind                  47           46
25 healthy               46           45
26 well                  43           40
27 security              40           40
28 like                  40           35
29 freedom               38           35
30 keeping               37           29
31 getting               35           32
32 house                 34           33
33 standard              33           32
34 love                  33           26
35 contentment           31           30
36 time                  31           29
37 want                  31           25
38 grandchildren         30           29
39 nice                  30           26
40 world                 29           29
41 really                28           28
42 things                28           25
43 son                   27           25
44 important             26           26
45 going                 26           25
46 see                   26           22
47 general               25           25
48 education             25           24
49 food                  23           22
50 suppose               23           20

Summary of the contextual categorical variables
   Gender       Education         Age_Educ       Gender_Educ
 Man  :494   E_Low   :478   >55_Low     :235   M_EduLow :221
 Woman:545   E_Medium:418   31_55_Low   :225   M_EduMed :197
             E_High  :143   <=30_Medium :187   M_EduHigh: 76
                            31_55_Medium:159   W_EduLow :257
                            >55_Medium  : 72   W_EduMed :221
                            <=30_High   : 61   W_EduHigh: 67
                            (Other)     :100

Summary of the contextual quantitative variables
      Age
 Min.   :18.00
 1st Qu.:30.00
 Median :44.00
 Mean   :45.82
 3rd Qu.:61.00
 Max.   :90.00
res.LexCA <- LexCA(res.TD, graph = FALSE, ncp = 6, context.sup = "ALL")
summary(res.LexCA)
Correspondence analysis summary

Eigenvalues
        Variance  % of var.  Cumulative % of var.
dim 1      0.393      2.604                 2.604
dim 2      0.376      2.487                 5.092
dim 3      0.339      2.248                 7.339
dim 4      0.334      2.215                 9.554
dim 5      0.326      2.162                11.716

Cramer's V  0.443    Inertia  15.101

DOCUMENTS

Coordinates
Only the first 10 elements are shown
     Dim 1   Dim 2   Dim 3   Dim 4   Dim 5
1   -0.950   0.098  -0.637  -0.118  -0.298
2    0.503   0.324  -0.099   0.200  -0.103
3   -1.222   0.259  -0.337   0.216  -0.024
4   -0.706   0.105  -0.685  -0.528  -0.108
5    1.150  -0.686  -1.795  -0.590   0.622
6    0.472  -0.038   0.835   0.511  -1.183
7   -1.154   0.241  -0.789  -0.306  -0.407
8    1.456  -0.726  -0.799  -0.071   0.351
9    0.547   0.074  -0.137  -0.837  -0.335
10   0.073  -0.586   0.475   1.065  -1.090

Contributions (by column total=100)
Only the first 10 elements are shown
     Dim 1   Dim 2   Dim 3   Dim 4   Dim 5
1    0.137   0.002   0.071   0.002   0.016
2    0.077   0.033   0.003   0.014   0.004
3    0.151   0.007   0.013   0.006   0.000
4    0.101   0.002   0.110   0.066   0.003
5    0.402   0.149   1.133   0.124   0.141
6    0.068   0.000   0.245   0.093   0.511
7    0.135   0.006   0.073   0.011   0.020
8    0.858   0.223   0.300   0.002   0.060
9    0.076   0.001   0.006   0.209   0.034
10   0.001   0.073   0.053   0.270   0.290

Square cosinus (by row total=1)
Only the first 10 elements are shown
     Dim 1   Dim 2   Dim 3   Dim 4   Dim 5
1    0.215   0.002   0.097   0.003   0.021
2    0.020   0.008   0.001   0.003   0.001
3    0.036   0.002   0.003   0.001   0.000
4    0.158   0.004   0.149   0.088   0.004
5    0.096   0.034   0.234   0.025   0.028
6    0.011   0.000   0.033   0.012   0.067
7    0.204   0.009   0.095   0.014   0.025
8    0.136   0.034   0.041   0.000   0.008
9    0.008   0.000   0.001   0.019   0.003
10   0.000   0.011   0.007   0.036   0.037

    Inertia
1     0.003
2     0.015
3     0.016
4     0.003
5     0.016
6     0.025
7     0.003
8     0.025
9     0.036
10    0.025

WORDS

Coordinates
Only the first 10 elements are shown
              Dim 1   Dim 2   Dim 3   Dim 4   Dim 5
able          0.931  -0.076  -0.527  -0.449   0.080
car           0.848   0.104   0.516  -1.098   0.178
children      0.073  -0.597   0.211   0.802  -0.509
church        0.604   0.738   0.395  -1.153  -1.567
comfortable   0.009  -0.378   0.108   0.572   0.307
comfortably   0.895  -0.753  -2.065  -1.032   0.899
content       0.768  -0.170  -0.499   0.470  -0.024
contentment  -1.180   0.329  -0.518   0.398  -0.006
daughter      0.642  -0.880   0.347   1.357  -1.750
dog           0.523  -1.072   1.695   0.363   0.111

Contributions (by-column total=100)
Only the first 10 elements are shown
              Dim 1   Dim 2   Dim 3   Dim 4   Dim 5
able          2.456   0.017   0.913   0.673   0.022
car           0.655   0.010   0.281   1.292   0.035
children      0.035   2.475   0.341   5.017   2.066
church        0.295   0.462   0.146   1.264   2.395
comfortable   0.000   0.151   0.014   0.389   0.115
comfortably   0.689   0.511   4.249   1.078   0.836
content       0.537   0.027   0.262   0.236   0.001
contentment   2.183   0.178   0.488   0.292   0.000
daughter      0.438   0.862   0.148   2.300   3.917
dog           0.221   0.973   2.694   0.125   0.012

Square cosinus (by-row total=1)
Only the first 10 elements are shown
              Dim 1   Dim 2   Dim 3   Dim 4   Dim 5
able          0.052   0.000   0.017   0.012   0.000
car           0.015   0.000   0.006   0.025   0.001
children      0.001   0.048   0.006   0.086   0.035
church        0.005   0.007   0.002   0.017   0.031
comfortable   0.000   0.004   0.000   0.009   0.002
comfortably   0.018   0.013   0.095   0.024   0.018
content       0.010   0.000   0.004   0.004   0.000
contentment   0.040   0.003   0.008   0.005   0.000
daughter      0.008   0.015   0.002   0.035   0.058
dog           0.004   0.016   0.041   0.002   0.000

             Inertia
able           0.186
car            0.173
children       0.195
church         0.252
comfortable    0.151
comfortably    0.152
content        0.210
contentment    0.215
daughter       0.219
dog            0.222

SUPPLEMENTARY CATEGORIES

Coordinates
Only the first 10 elements are shown
                        Dim 1   Dim 2   Dim 3   Dim 4   Dim 5
Gender.Man              0.000  -0.009   0.031  -0.033  -0.001
Gender.Woman            0.000   0.007  -0.026   0.028   0.001
Education.E_Low         0.049  -0.091  -0.031   0.044  -0.035
Education.E_Medium     -0.051   0.031  -0.008  -0.044   0.037
Education.E_High       -0.018   0.216   0.125  -0.021   0.010
Age_Educ.<=30_Low       0.054  -0.212  -0.033  -0.008   0.369
Age_Educ.<=30_Medium   -0.004  -0.049   0.006  -0.172   0.064
Age_Educ.<=30_High      0.081   0.169   0.126  -0.106  -0.004
Age_Educ.31_55_Low     -0.041  -0.119   0.005   0.032  -0.006
Age_Educ.31_55_Medium  -0.078   0.103   0.003   0.017   0.067

Square cosinus
Only the first 10 elements are shown
                        Dim 1   Dim 2   Dim 3   Dim 4   Dim 5
Gender.Man              0.000   0.001   0.015   0.018   0.000
Gender.Woman            0.000   0.001   0.015   0.018   0.000
Education.E_Low         0.011   0.039   0.005   0.009   0.006
Education.E_Medium      0.052   0.019   0.001   0.040   0.028
Education.E_High        0.009   1.355   0.455   0.013   0.003
Age_Educ.<=30_Low       0.006   0.087   0.002   0.000   0.265
Age_Educ.<=30_Medium    0.000   0.002   0.000   0.027   0.004
Age_Educ.<=30_High      0.050   0.216   0.120   0.085   0.000
Age_Educ.31_55_Low      0.002   0.018   0.000   0.001   0.000
Age_Educ.31_55_Medium   0.040   0.069   0.000   0.002   0.030

v.