INDEX
Negative Logits
Token          Logit
Descriptor     -0.09
সাধ            -0.08
তিন            -0.08
_descriptor    -0.08
urdy           -0.08
descriptor     -0.08
पति            -0.08
vorhand        -0.08
আদাল           -0.08
Manipulator    -0.08
Positive Logits
Token                 Logit
disproportionately    0.09
tariffs               0.08
.As                   0.08
disproportion         0.08
उत                    0.08
unnecessarily         0.08
Bush                  0.07
Bush                  0.07
greedy                0.07
quotas                0.07
Activations Density: 0.014%