Negative Logits
±ظ         -0.07
Sharia     -0.07
Sır        -0.07
Hannity    -0.06
kurulan    -0.06
.paper     -0.06
Kadın      -0.06
ego        -0.06
FR         -0.06
Ả          -0.06
Positive Logits
"][        0.07
devoted    0.06
obl        0.06
.↵↵        0.06
extra      0.06
believes   0.06
Exercise   0.06
restart    0.06
。↵↵       0.06
LEM        0.06
Activations Density 0.007%