INDEX
Explanations
proper nouns of people
the word "who" in various contexts
Negative Logits
Token         Logit
³³³³          -0.77
BACK          -0.72
Bound         -0.68
Dos           -0.64
Glob          -0.62
MER           -0.61
Situation     -0.61
Georg         -0.61
Processing    -0.60
Due           -0.60
Positive Logits
Token          Logit
soever         1.21
oping          0.96
oped           0.88
ever           0.86
oversaw        0.86
accompanies    0.85
attended       0.85
preceded       0.85
specialize     0.80
resided        0.80
Activations Density: 0.181%