Negative Logits
HStack        -0.93
being         -0.88
اطر           -0.86
することで    -0.84
the           -0.84
whose         -0.82
America       -0.82
which         -0.82
رده           -0.82
Dalam         -0.81
Positive Logits
whenever      1.34
and           1.22
whenever      1.05
holds         1.02
holds         0.98
ldorf         0.91
Whenever      0.91
рым           0.86
idenav        0.81
ston          0.79
Activations Density: 0.046%