INDEX
Negative Logits
rocking      -0.07
ở            -0.07
enic         -0.07
height       -0.07
crud         -0.07
Cooper       -0.07
circulate    -0.07
broadcast    -0.07
Arb          -0.07
fo           -0.07

Positive Logits
ward         0.11
most         0.10
wards        0.10
쪽           0.10
দিকে         0.09
-right       0.09
party        0.08
expos        0.08
kanan        0.08
翁           0.08

Activations Density: 0.031%