INDEX
Explanations
sentences that involve conditional phrases or alternatives
Negative Logits
either    -0.19
empo      -0.18
Either    -0.16
entonces  -0.16
Either    -0.15
både      -0.15
や        -0.15
więc      -0.15
EITHER    -0.15
gether    -0.14
Positive Logits
else       0.55
phans      0.52
ients      0.48
ifice      0.43
even       0.42
otherwise  0.40
acles      0.39
acular     0.39
worse      0.38
lando      0.37
Activations Density 0.370%
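A minimal sketch of one way a density figure like this can be read, assuming it means the fraction of token positions at which the feature's activation is above zero. The page does not define the metric, so that reading is an assumption. The function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical; the data is synthetic and chosen only so the arithmetic reproduces the displayed percentage.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of token positions where the
    feature fires at all. The source page does not spell this out.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Synthetic example (not real data): 37 active positions out of 10,000.
sample = [0.0] * 9963 + [1.2] * 37
density = activation_density(sample)
formatted = f"{density:.3%}"
print(formatted)
```

Under this reading, a density of 0.370% would mean the feature is active on roughly 37 of every 10,000 tokens, which is consistent with a sparse feature that fires only on specific constructions.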