INDEX
Negative Logits
дон       0.54
DRA       0.45
code      0.44
Code      0.44
f         0.43
нями      0.42
zá        0.42
éraires   0.41
вести     0.40
Thành     0.40
Positive Logits
ρ             0.58
μον           0.54
κ             0.52
increment     0.52
attrition     0.52
γλώ           0.50
sufficiency   0.50
rinsing       0.49
μην           0.49
wyja          0.49
Activations Density 0.000%