Explanations
the word "whatever" in various contexts
Negative Logits
inker      -0.18
ses        -0.17
ea         -0.16
yers       -0.16
hiba       -0.16
een        -0.16
hor        -0.16
mtree      -0.15
hes        -0.15
sj         -0.15
Positive Logits
else       0.21
ơ          0.18
onne       0.16
anging     0.15
theless    0.15
dock       0.15
lli        0.14
igator     0.14
anged      0.14
ly         0.14
Activations Density: 0.017%