Explanations
instances of the word "no"
Negative Logits
iek         -0.15
erm         -0.14
imm         -0.14
ùa          -0.14
inker       -0.14
_shortcode  -0.14
Lam         -0.14
inja        -0.13
gold        -0.13
Laf         -0.13
Positive Logits
-pills      0.16
jsp         0.15
Äĥr         0.15
uese        0.15
akens       0.14
iazza       0.14
            0.14
rray        0.14
Pilot       0.14
igm         0.13
Activations Density 0.060%