Explanations
references to religious teachings and beliefs
Negative Logits
perty          -0.15
idon           -0.15
illard         -0.14
ovich          -0.14
Reputation     -0.14
estate         -0.14
abant          -0.14
stellen        -0.14
displayName    -0.13
ark            -0.13
Positive Logits
rel            0.54
religion       0.52
Rel            0.50
religions      0.49
Rel            0.47
REL            0.47
-rel           0.47
Religion       0.45
REL            0.44
rel            0.39
Activations Density: 0.230%