INDEX
Explanations
people's names, particularly those with "Moh" or "Ramirez"
the repeated reference to the name "Mohammed."
Negative Logits
Token        Logit
istics       -0.98
istically    -0.95
istical      -0.75
balloons     -0.71
Stra         -0.66
laps         -0.65
istic        -0.65
ãĥ¯          -0.65
ificial      -0.65
charms       -0.63
Positive Logits
Token        Logit
awk          1.47
awks         1.38
sin          1.03
doms         1.02
ammad        0.97
atche        0.97
ammed        0.92
renheit      0.92
abit         0.90
ira          0.88
Activations Density: 0.038%