INDEX
Explanations
statements expressing personal beliefs or expectations
expressions of beliefs or opinions from individuals
Negative Logits
inary     -0.72
abi       -0.69
lite      -0.65
clock     -0.64
attle     -0.60
Written   -0.60
leness    -0.60
status    -0.59
commit    -0.58
Ware      -0.58
Positive Logits
olate      0.68
olated     0.67
iewicz     0.59
sclerosis  0.58
revival    0.57
SUN        0.57
phas       0.56
terson     0.56
suing      0.56
aspers     0.56
Activations Density: 0.210%