INDEX
Negative Logits
Token       Logit
Rossi       -0.73
Effective   -0.62
Consent     -0.62
Wan         -0.61
Mehran      -0.58
Ples        -0.57
FU          -0.57
Dee         -0.56
primary     -0.56
RECT        -0.55
Positive Logits
Token       Logit
emouth      1.66
oir         1.20
ourn        1.09
ette        1.06
ettes       1.04
auts        0.97
nette       0.97
esses       0.96
oise        0.94
ments       0.93
Activations Density: 0.012%