INDEX
Explanations
mentions of atheism and associated beliefs
Negative Logits
ulkan     -0.16
immel     -0.15
zen       -0.14
Faster    -0.14
omez      -0.14
ternet    -0.14
ovit      -0.14
emme      -0.13
omain     -0.13
istrar    -0.13
Positive Logits
berger    0.14
rieg      0.14
frey      0.14
äd        0.14
Ferm      0.13
cop       0.13
upo       0.13
goto      0.13
agues     0.13
cuff      0.13
Activations Density: 0.117%