INDEX
Explanations
the word "but" in various contexts
Negative Logits
osu     -0.16
سÙĥ     -0.14
uire    -0.14
mour    -0.14
iami    -0.13
uces    -0.13
ença    -0.13
Trick   -0.13
oldem   -0.13
uxe     -0.13
Positive Logits
/or     0.16
ommen   0.14
anes    0.14
ect     0.14
tems    0.14
irl     0.14
/OR     0.14
ugg     0.13
chers   0.13
ernel   0.13
Activations Density 0.072%