INDEX
Explanations
Mentions of 'The Guy' or variations that characterize a male figure
Head Attr Weights
0:0.04
1:0.09
2:0.06
3:0.05
4:0.02
5:0.04
6:0.07
7:0.06
8:0.03
9:0.05
10:0.35
11:0.09
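
The head attribution weights above can be summarized with a few lines of Python. This is a minimal sketch, assuming the twelve listed values are per-head weights for heads 0 to 11 and that a larger weight means a larger contribution to this feature (an interpretation, not something the listing states). The variable names are illustrative only.

```python
# Head attribution weights copied from the listing above (head index -> weight).
head_attr_weights = {
    0: 0.04, 1: 0.09, 2: 0.06, 3: 0.05, 4: 0.02, 5: 0.04,
    6: 0.07, 7: 0.06, 8: 0.03, 9: 0.05, 10: 0.35, 11: 0.09,
}

# Head with the largest listed weight.
dominant_head = max(head_attr_weights, key=head_attr_weights.get)

# The displayed weights are rounded, so they need not sum to exactly 1.
total = sum(head_attr_weights.values())

# Fraction of the listed total carried by the dominant head.
dominant_share = head_attr_weights[dominant_head] / total

print(f"dominant head: {dominant_head}")
print(f"sum of listed weights: {total:.2f}")
print(f"share of dominant head: {dominant_share:.1%}")
```

Under these assumptions, head 10 stands out: its weight of 0.35 is several times larger than any other head's.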
Negative Logits
orn      -2.99
ORN      -2.71
onga     -2.49
saf      -2.32
orns     -2.29
Pand     -2.24
orah     -2.23
ORT      -2.22
onduct   -2.17
ort      -2.15
Positive Logits
guy      4.33
guys     3.91
Guys     3.64
dudes    3.57
dude     3.30
guy      3.28
Guy      3.12
Guy      2.77
folks    2.73
buddies  2.71
Activations Density 0.003%