NEGATIVE LOGITS
Token        Value
0            0.85
После        0.73
嵬           0.73
0            0.70
История      0.68
行う         0.67
к            0.67
楎           0.65
Ⴚ            0.65
políticos    0.64

POSITIVE LOGITS
Token        Value
boasted      0.93
’            0.84
boasts       0.81
t            0.78
j            0.78
bragging     0.77
brag         0.73
previews     0.73
ट            0.69
SE           0.68

Activations Density: 0.002%