Explanations
references to religious figures, specifically Rabbis
references to religious figures and the Torah
Negative Logits
mson          -0.94
swick         -0.82
iago          -0.81
bodied        -0.81
auga          -0.77
Hurricanes    -0.77
orph          -0.73
antle         -0.72
inho          -0.70
fight         -0.68
Positive Logits
rabbi         1.16
Rabbi         1.15
rabb          1.10
Torah         1.03
anyahu        1.00
Judaism       0.99
ת             0.96
×ķ            0.89
Netanyahu     0.88
׾             0.88
Activations Density 0.018%