INDEX
Explanations
verbs indicating actions performed by individuals
phrases related to accountability and actions performed by individuals
Negative Logits
Entered     -0.82
suspects    -0.68
Gork        -0.64
hog         -0.59
tur         -0.58
Det         -0.58
tongues     -0.57
tires       -0.56
drawn       -0.55
SIG         -0.55
Positive Logits
pez             0.98
differently     0.83
wrong           0.78
女              0.76
ournal          0.73
administr       0.72
etting          0.72
unconsciously   0.69
hing            0.69
omnia           0.67
Activations Density 0.063%