INDEX
Negative Logits
retweet  0.76
gib  0.75
Arial  0.74
orro  0.72
vii  0.71
আমি  0.70
elytra  0.69
ulty  0.68
ˇ  0.68
analy  0.67
Positive Logits
ować  0.78
是不  0.71
blancas  0.70
sejahtera  0.70
nantinya  0.70
જીવન  0.69
puede  0.68
ς  0.67
acompañ  0.66
収入  0.66
Activations Density: 0.001%