Explanations
terms and phrases related to religious beliefs and authority
Negative Logits
secular         -0.17
.constraints    -0.16
bose            -0.15
RSA             -0.15
leich           -0.15
bible           -0.15
religious       -0.14
/container      -0.14
Religious       -0.14
฿               -0.14
Positive Logits
Nic             0.31
Creed           0.28
dog             0.28
Nice            0.27
teaching        0.26
Dog             0.25
cre             0.24
doctr           0.24
Cre             0.24
doctrine        0.24
Activations Density 0.076%