Explanations
references to government programs and budgetary issues
Negative Logits
Token      Logit
EATURE     -0.16
usz        -0.16
顿         -0.15
arton      -0.15
отя        -0.15
raison     -0.15
itched     -0.14
swick      -0.14
orde       -0.14
ěl         -0.14
Positive Logits
Token                         Logit
our                           0.23
unser                         0.19
ourselves                     0.18
America                       0.17
commons                       0.17
protections                   0.17
bru                           0.16
U+200E (left-to-right mark)   0.16
une                           0.16
ours                          0.15
Activations Density: 0.495%