Explanations
the word "boys" in various contexts
references to boys and gender-related discussions
Negative Logits
Token       Logit
arily       -0.85
lyak        -0.78
itures      -0.73
osi         -0.67
showc       -0.67
aries       -0.67
Machina     -0.66
olia        -0.66
mediated    -0.65
aeda        -0.64
Positive Logits
Token       Logit
ages        1.07
friend      0.98
puberty     0.95
boys        0.91
hood        0.90
Scouts      0.89
boys        0.88
scout       0.84
scouts      0.82
aged        0.79
Activations Density 0.048%
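
The density figure above is most naturally read as the share of token positions on which the feature fires. The sketch below shows that calculation under stated assumptions: the function name `activation_density`, the zero threshold, and the sample data are all hypothetical and are not taken from the dashboard or from any tool's actual implementation.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: a position counts as "active" when its activation is strictly
    greater than `threshold` (0.0 by default). The dashboard's own definition
    may differ.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 10,000 token positions, 5 of which are active.
sample = [0.0] * 9995 + [1.2, 0.4, 3.1, 0.9, 2.2]
density = activation_density(sample)

# Format as a percentage with three decimals, matching the dashboard's style.
print(f"{density:.3%}")
```

A density as low as 0.048% would mean the feature is active on roughly one token position in two thousand, which fits a feature tied to a narrow topic.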