INDEX
Explanations
descriptions of physical pain, physical sensations, and physical actions
Negative Logits
Token         Logit
tesy          -0.85
ady           -0.78
volunte       -0.72
itional       -0.72
conservancy   -0.71
itionally     -0.71
icip          -0.70
advoc         -0.70
oubted        -0.68
odox          -0.68
Positive Logits
Token         Logit
Whatever      1.14
Anything      1.14
Something     1.10
And           1.07
Everywhere    1.06
Anyway        1.04
Everything    1.03
Everybody     1.01
Nothing       1.00
Nobody        1.00
Activations Density: 7.806%