Explanations
the term "Phantom" or related variations in text
Negative Logits
aug            -0.17
Crud           -0.16
edy            -0.15
uteur          -0.14
iddet          -0.14
ollo           -0.14
endra          -0.14
Č              -0.14
Mig            -0.13
WebResponse    -0.13
Positive Logits
ouri           0.16
.li            0.15
Suarez         0.14
agon           0.14
ivre           0.14
unser          0.13
oser           0.13
себе           0.13
ynos           0.13
ours           0.13
Activations Density 0.004%