Explanations
mentions of churches
references to churches
Negative Logits
neurot        -0.68
DonaldTrump   -0.67
Haf           -0.66
Yak           -0.66
à¸            -0.66
LER           -0.61
Rog           -0.61
Bagg          -0.60
berman        -0.60
nir           -0.60
Positive Logits
goers          1.18
yard           1.10
esan           1.00
yards          0.90
church         0.90
choir          0.88
fathers        0.87
congregation   0.86
cha            0.85
church         0.81
Activations Density 0.019%