Negative Logits
unlocked      -0.08
dependence    -0.07
locking       -0.07
saanut        -0.07
provoca       -0.07
มาก           -0.07
">@           -0.07
't            -0.07
той           -0.07
endees        -0.07
Positive Logits
ūd            0.09
cancelling    0.08
ijiet         0.08
faculdade     0.08
pudding       0.08
法            0.08
softened      0.08
nosť          0.08
مدرس          0.08
Chest         0.08
Activations Density 0.004%