INDEX
Explanations
phrases related to human characteristics and attributes such as physical appearance, abilities, and societal roles
instances of the verb "have" in various contexts
Negative Logits
iaz            -0.68
ensing         -0.67
prosecuting    -0.65
oyer           -0.65
etting         -0.64
AU             -0.64
—-             -0.63
udo            -0.62
TG             -0.62
hooting        -0.60
Positive Logits
been           1.26
undergone      1.20
been           1.13
gotten         0.98
suffered       0.93
lived          0.92
access         0.91
difficulty     0.90
Been           0.90
gone           0.89
Activations Density: 0.335%