Negative Logits
Token          Logit
Gemeinsame     -1.01
GREY           -0.91
⤞              -0.90
сного          -0.87
letti          -0.84
UNIDENTIFIED   -0.83
engels         -0.83
tyd            -0.82
pence          -0.80
ặt             -0.79
Positive Logits
Token          Logit
conducted      0.77
wretched       0.72
bună           0.71
espesor        0.71
合う           0.71
experi         0.71
brica          0.71
どこに         0.68
ligera         0.68
foque          0.67
Activations Density: 0.002%