NEGATIVE LOGITS
Pradesh   -0.76
decomp    -0.76
Haram     -0.70
urses     -0.70
ctions    -0.64
bably     -0.63
ities     -0.60
FANT      -0.60
neglig    -0.60
chio      -0.60
POSITIVE LOGITS
er        1.37
erness    1.28
ers       1.14
ership    1.07
Twain     0.98
ipl       0.92
ing       0.90
ings      0.90
eer       0.90
ed        0.90
ACTIVATIONS DENSITY
2.898%