INDEX
NEGATIVE LOGITS
ificent	0.52
தெரிவி	0.51
inairement	0.49
affirme	0.49
ially	0.49
telah	0.48
ariamente	0.48
בפ	0.46
recebeu	0.46
joins	0.46

POSITIVE LOGITS
ป	0.46
Му	0.45
newL	0.44
ኵ	0.44
undesired	0.43
사람	0.43
Romantic	0.43
حرام	0.42
items	0.42
Modulo	0.42

Activations Density 0.000%