Explanations
terms related to rankings or scores in competitive settings
Negative Logits
Token    Logit
[s       -0.17
enas     -0.16
dma      -0.15
ниÑĤ     -0.15
(s       -0.14
ÙijÙĦ     -0.14
rias     -0.14
alla     -0.14
mand     -0.14
ipa      -0.14
Positive Logits
Token        Logit
ATO          0.16
signature    0.15
iday         0.14
Rim          0.14
ydk          0.14
Ùĩ           0.14
ÏĤ           0.14
çĨŁ          0.14
fol          0.14
erotici      0.14
Activations Density: 0.290%