INDEX
NEGATIVE LOGITS
be            0.40
lobbying      0.38
repatri       0.37
disparity     0.37
advocates     0.36
corporate     0.36
organization  0.35
rishna        0.35
greater       0.35
enter         0.35
POSITIVE LOGITS
(blank)       0.42
ट्वी           0.41
않을          0.40
чуде          0.40
않았          0.39
MDET          0.38
ጌ             0.37
evanes        0.37
Дон           0.36
⼯            0.36
Activations Density 0.004%