Explanations
references to Christianity and related religious themes
Negative Logits
extAlignment   -0.37
колба          -0.36
бы             -0.36
clocked        -0.35
crispy         -0.35
机             -0.34
ibrill         -0.34
πό             -0.33
idorm          -0.33
vacancy        -0.33
Positive Logits
Christian      1.09
Chriftian      1.05
Christian      1.05
religion       0.98
christian      0.92
religious      0.89
Catholic       0.89
CHRISTIAN      0.85
christian      0.83
religions      0.82
Activations Density 0.094%