Explanations
numerical values related to data points
Negative Logits
unm        -0.16
olang      -0.15
illi       -0.15
hints      -0.15
...        -0.15
941        -0.15
uke        -0.15
(          -0.14
urga       -0.14
contr      -0.13
Positive Logits
ìłĪ        0.15
/tos       0.15
aeda       0.15
å¥         0.14
Cald       0.14
leftright  0.13
izin       0.13
ÙĪÙĤت     0.13
Smooth     0.13
otron      0.13
Activations Density: 0.000%