Explanations
variations of punctuation and periods, particularly at the end of sentences or phrases
Negative Logits
undert      -0.18
lifetime    -0.16
orage       -0.16
Rus         -0.15
roma        -0.15
Lifetime    -0.14
Lifetime    -0.14
agra        -0.14
QR          -0.14
ÏīÏĤ        -0.13
Positive Logits
zych            0.20
neob            0.16
edics           0.15
agnostic        0.15
رد              0.14
Lesser          0.14
linky           0.14
ValueChanged    0.14
itzer           0.14
HECK            0.14
Activations Density: 0.155%