INDEX
Explanations
phrases related to historical military funding and armaments
occurrences of the word "The."
Negative Logits
Token    Logit
Allows   -0.70
#$       -0.69
.","     -0.68
ãĥĺ      -0.66
dk       -0.66
rade     -0.65
Ò        -0.65
ãĥ»      -0.64
""       -0.63
âĵĺ      -0.63
Positive Logits
Token     Logit
resa      1.41
odore     1.41
oret      1.41
Problem   1.29
downside  1.23
ories     1.19
irony     1.19
gist      1.18
Basics    1.17
problem   1.10
Activations Density 0.370%
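
The density figure is usually read as the share of sampled tokens on which the feature fires at all. The snippet below is a minimal sketch under that assumption (density = fraction of tokens with activation above zero). The function name `activation_density`, the `threshold` parameter, and the sample values are hypothetical and do not come from the dashboard.

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" means the share of tokens with a strictly
    positive activation; the dashboard's exact definition may differ.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical per-token activations for one feature (not real data).
sample = [0.0, 0.0, 1.2, 0.0]
print(f"{activation_density(sample):.3%}")
```

Under this reading, a density of 0.370% would correspond to roughly 37 active tokens per 10,000 sampled.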