INDEX
Explanations
references to different states and their identification in a context related to governance or policy
Negative Logits
thon      -0.21
hey       -0.17
luv       -0.16
teenth    -0.15
them      -0.15
meld      -0.15
ulk       -0.15
aber      -0.15
erd       -0.14
ne        -0.14
Positive Logits
craft      0.25
wide       0.24
-of        0.21
hood       0.20
Unidos     0.19
ful        0.19
/local     0.19
-wide      0.18
/global    0.18
utory      0.17
Activations Density 0.078%