INDEX
Explanations
instances of the word "one" and expressions indicating singular concepts or actions
Negative Logits
reater          -0.17
putc            -0.16
ulu             -0.15
uluk            -0.15
specialchars    -0.15
istrovství      -0.15
icense          -0.15
vik             -0.14
Пос             -0.14
iveau           -0.14
Positive Logits
assis           0.17
ZR              0.15
aza             0.14
ạc              0.14
undy            0.14
diplom          0.14
면              0.14
ockey           0.14
unh             0.14
unny            0.13
Activations Density 0.014%
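
For readers unfamiliar with the density figure, the sketch below shows one common way such a number can be computed: the fraction of token positions on which the feature's activation is above a threshold. This is an assumption about the definition, not something the page states. The function name `activation_density`, the `threshold` parameter, and the toy data are all hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumed definition: density = (# active positions) / (# positions).
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Toy data (hypothetical): 100,000 token positions, 14 of them active.
sample = [0.0] * 99_986 + [1.5] * 14
density = activation_density(sample)
print(f"{density:.3%}")
```

With this toy input the feature fires on 14 of 100,000 positions, which is the same order of sparsity as the figure reported above.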