Negative Logits
OL      -0.08
ol      -0.07
090     -0.07
Bloom   -0.07
do      -0.07
Cool    -0.07
Wool    -0.07
04      -0.07
chool   -0.07
’ve     -0.07
Positive Logits
against   0.21
Against   0.18
against   0.17
Against   0.14
tegen     0.09
против    0.08
atti      0.08
Anti      0.07
ضد        0.07
contra    0.07
Activations Density: 0.022%