INDEX
Explanations
the word "else" as a focal point
instances of the phrase "what else."
Negative Logits
Token       Logit
acity       -0.64
©¶æ         -0.62
¶ħ          -0.61
Mehran      -0.60
Lenin       -0.60
ãĥī         -0.59
Abstract    -0.59
rought      -0.59
uay         -0.57
ousands     -0.57
Positive Logits
Token       Logit
worldly     1.27
besides     0.98
entirely    0.83
mattered    0.69
chy         0.68
ptive       0.68
arettes     0.67
nearby      0.66
.}          0.66
around      0.65
Activations Density 0.041%
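
The density figure above is commonly read as the fraction of sampled token positions on which the feature fires. The sketch below shows that calculation under that assumption; the function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and are not taken from the dashboard itself.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of token positions with a
    nonzero (above-threshold) activation. The dashboard may define it
    differently.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Made-up data: 100,000 token positions, 41 of them active.
# Chosen only to land on the same scale as the reported figure.
sample = [0.0] * 99_959 + [1.0] * 41
density = activation_density(sample)
print(f"{density:.3%}")
```

A density this low would mean the feature is silent on almost all tokens, which is consistent with an explanation as narrow as the phrase "what else".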