INDEX
Explanations
references to religious themes or discussions
Negative Logits
Seven    -0.25
七    -0.23
fours    -0.22
02    -0.21
Nine    -0.21
八    -0.21
nine    -0.20
十二    -0.20
Nine    -0.20
sixth    -0.20
Positive Logits
[token not shown]    0.17
autumn    0.16
mu    0.14
_    0.14
b    0.14
p    0.14
iled    0.14
r    0.14
iman    0.13
g    0.13
Activations Density 0.562%