Explanations
references to boys
references to boys and related contexts
Negative Logits
itures       -0.87
iture        -0.83
mediated     -0.74
osi          -0.73
aries        -0.72
Accessory    -0.71
ary          -0.71
ation        -0.70
itect        -0.69
REC          -0.68
Positive Logits
hift         0.99
Scouts       0.97
boys         0.93
friend       0.87
boys         0.79
ages         0.78
volent       0.77
friends      0.77
puberty      0.76
scouts       0.75
Activations Density 0.014%