INDEX
Explanations
references to historical religious figures and events
Negative Logits
977     -0.20
obao    -0.15
989     -0.15
_mk     -0.15
sat     -0.15
eba     -0.14
coe     -0.14
opp     -0.14
proc    -0.14
iona    -0.14
Positive Logits
Bing        0.17
_builtin    0.16
movable     0.15
lit         0.14
ateurs      0.14
rna         0.14
unge        0.14
merce       0.14
Colleg      0.14
mov         0.14
Activations Density 0.084%