INDEX
Explanations
references to different states or conditions
Negative Logits
Token       Logit
hey         -0.21
thon        -0.17
sak         -0.17
them        -0.17
_states     -0.17
sov         -0.17
los         -0.17
hest        -0.17
statements  -0.16
nop         -0.16
Positive Logits
Token       Logit
craft       0.37
hood        0.35
-of         0.28
wide        0.28
ful         0.26
fully       0.26
house       0.24
Unidos      0.24
holder      0.23
manship     0.23
Activations Density: 0.104%