Explanations
references to Jewish religious figures and texts
mentions of religious figures and texts, particularly those associated with Judaism
Negative Logits
swick  -0.92
anwhile  -0.75
Gators  -0.72
auga  -0.71
inho  -0.71
iago  -0.70
mble  -0.69
mson  -0.69
bodied  -0.68
duct  -0.68
Positive Logits
׾  1.13
ש  1.12
×  1.11
×ķ  1.11
×Ļ  1.10
×  1.10
×Ļ×  1.09
ת  1.03
ר  1.02
×Ķ  0.99
Activations Density 0.080%