INDEX
Explanations
themes related to atheism and critiques of religious beliefs
Negative Logits
egin      -0.15
dense     -0.15
rocket    -0.15
emark     -0.15
odb       -0.14
δÏİ       -0.14
oshi      -0.14
æĿ        -0.14
dm        -0.14
ystick    -0.14
Positive Logits
sap        0.15
tolerance  0.15
toler      0.14
Antar      0.14
vis        0.14
_titles    0.14
cox        0.14
tember     0.14
bens       0.14
tooth      0.14
Activations Density 0.326%