INDEX
Explanations
punctuation and symbols
Negative Logits
ekl      -0.15
tug      -0.14
aff      -0.14
asu      -0.14
еÑĢ      -0.14
пеÑĩ     -0.14
632      -0.13
æĦ       -0.13
sti      -0.13
armor    -0.13
Positive Logits
UY        0.16
enne      0.16
Interval  0.15
avr       0.15
wick      0.15
Interval  0.14
977       0.14
an        0.14
ROL       0.14
voke      0.14
Activations Density 0.014%