INDEX
Explanations
occurrences of the word "else."
phrases that emphasize alternatives or other options
Negative Logits
anon        -0.68
Telescope   -0.66
haw         -0.62
omorph      -0.62
ulas        -0.61
iewicz      -0.61
urger       -0.61
uto         -0.60
chen        -0.58
ASED        -0.58
Positive Logits
else        1.05
swer        0.91
Else        0.90
else        0.89
worldly     0.87
icter       0.87
describ     0.85
behavi      0.83
mosqu       0.78
Else        0.77
Activations Density: 0.014%