INDEX
NEGATIVE LOGITS
))));          -0.75
oid            -0.74
Transverse     -0.72
)));           -0.70
일에           -0.69
još            -0.68
iti            -0.68
elecciones     -0.67
forcefully     -0.66
이드           -0.66
POSITIVE LOGITS
ative          1.70
ativeness      1.70
atively        1.48
ATIVE          1.24
itive          1.16
tative         1.09
poles          0.93
ulative        0.90
ITIVE          0.90
tive           0.89
ACTIVATIONS DENSITY
0.031%