INDEX
Explanations
instances of the word "even," indicating a focus on emphasizing capabilities or situations
Negative Logits
either       -0.22
either       -0.21
Either       -0.20
Either       -0.19
thậm         -0.18
EITHER       -0.17
également    -0.16
nawet        -0.15
neither      -0.15
anyway       -0.15
Positive Logits
though       0.31
-handed      0.27
though       0.26
Though       0.25
worse        0.24
ness         0.23
Though       0.23
-number      0.21
sometimes    0.19
-more        0.18
Activations Density 0.062%