Explanations
instances of the word "dude"
references to the word "dude" and its variations, indicating a focus on informal language and interactions among male characters
Negative Logits
pring         -0.79
atories       -0.76
Integ         -0.71
ī             -0.68
ateg          -0.68
mberg         -0.67
":["          -0.66
HCR           -0.66
Cosponsors    -0.66
fman          -0.65
Positive Logits
holes         1.01
dude          0.96
hole          0.95
ards          0.84
Dude          0.83
dudes         0.78
fuck          0.72
jeans         0.72
netflix       0.70
Beard         0.70
Activations Density: 0.018%