Negative Logits
Token         Logit
pretended     -0.77
preferring    -0.71
Acting        -0.70
IContainer    -0.69
pretending    -0.68
seems         -0.68
refusing      -0.65
Acting        -0.65
seeming       -0.65
ьаж           -0.65
Positive Logits
Token         Logit
to            0.92
ly            0.85
LY            0.71
]='\          0.59
Werbung       0.56
zunehmen      0.53
une           0.50
nesses        0.50
expandindo    0.50
schaft        0.50
Activations Density: 0.328%