INDEX
Explanations
the word "either" and its variations in different contexts
Negative Logits
vla     -0.16
vv      -0.15
vb      -0.14
rsa     -0.14
nes     -0.14
sci     -0.14
ssp     -0.14
ebo     -0.14
vak     -0.14
acco    -0.14
Positive Logits
wel      0.21
phans    0.21
anged    0.19
theless  0.17
-than    0.17
EITHER   0.17
either   0.16
either   0.16
anges    0.15
angs     0.15
Activations Density: 0.026%