Explanations
references to assessment tools and measurement criteria
Negative Logits
洋  -0.17
hil  -0.15
proof  -0.15
Proof  -0.15
役  -0.14
chts  -0.14
/mac  -0.14
laid  -0.14
abit  -0.14
cur  -0.14
Positive Logits
ebek  0.17
Sexy  0.15
eld  0.15
ivative  0.15
وغ  0.14
upakan  0.14
elden  0.14
aper  0.14
erate  0.14
esi  0.14
Activations Density: 0.027%