INDEX
Explanations
references to personal identity or self-description
"a" followed by an adjective
a [descriptor]
Negative Logits
démocr         -0.70
Púb            -0.68
educativas     -0.66
educativos     -0.65
koľvek         -0.63
motivadoras    -0.61
natale         -0.60
rerum          -0.60
maschile       -0.60
liknande       -0.59
Positive Logits
optim          0.67
college        0.60
human          0.58
professional   0.54
proud          0.53
Hoek           0.52
cyn            0.51
hardcore       0.51
true           0.50
ciaio          0.50
Activations Density 0.302%