Explanations
mentions of the title "Ms." with high activation
instances of the title "Ms" followed by a name, indicating references to female individuals in a narrative
Negative Logits
ãĥ³ãĤ¸    -0.65
annexed   -0.64
euth      -0.64
roundup   -0.63
Released  -0.62
SHIP      -0.61
shape     -0.61
ãĥ¼ãĥĨ    -0.61
CAST      -0.61
ranks     -0.61
Positive Logits
gt        0.81
athed     0.80
ureen     0.74
mt        0.73
mia       0.71
esh       0.71
Xiang     0.67
Joyce     0.66
Ms        0.65
unia      0.65
Activations Density 0.008%