INDEX
Explanations
specific names and places related to personal relationships and public figures
Negative Logits
ccb         -0.14
egot        -0.13
dea         -0.13
оÑĢе       -0.13
rire        -0.13
ugo         -0.12
-archive    -0.12
icket       -0.12
ukkit       -0.12
ÑĢоÑģÑĤо    -0.12
Positive Logits
idel        0.13
což         0.13
intro       0.12
oret        0.12
followed    0.12
ynes        0.12
gressor     0.12
chop        0.12
)?.         0.12
indre       0.11
Activations Density: 1.136%