INDEX
Explanations
declarative sentences with neutrality
the word "was" in various contexts
Negative Logits
sburg        -0.92
uld          -0.74
iny          -0.70
Spani        -0.70
olon         -0.66
bury         -0.65
inas         -0.65
usercontent  -0.65
know         -0.65
peg          -0.64
Positive Logits
widely       1.03
designed     0.99
intended     0.98
aimed        0.96
slated       0.94
meant        0.93
billed       0.93
awarded      0.92
prompted     0.92
housed       0.92
Activations Density: 0.299%