INDEX
Explanations
references to the term "Boy", often as part of a name or group; activation values indicate how relevant each occurrence is
references to the word "Boy" and its variations
Negative Logits
Token        Logit
aeda         -1.16
aukee        -0.83
ickr         -0.77
podium       -0.77
iment        -0.72
encing       -0.70
insula       -0.69
mble         -0.68
instability  -0.67
ENC          -0.66
Positive Logits
Token        Logit
friend       1.47
cott         1.25
Scouts       1.20
hood         1.16
boys         1.15
boy          1.12
nton         1.08
friends      1.07
Scout        1.04
stown        0.99
Activations Density 0.012%
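
To put the density figure in concrete terms, the sketch below converts it into an expected count of activating tokens. It assumes that "density" means the fraction of sampled tokens on which the feature has a nonzero activation, which this page does not state explicitly. The function name and the one-million-token sample size are hypothetical, chosen only for illustration.

```python
def expected_active_tokens(density_percent: float, n_tokens: int) -> float:
    """Expected number of tokens on which the feature fires.

    Assumes density is the share of tokens with nonzero activation,
    expressed as a percentage (an assumption, not stated on the page).
    """
    return density_percent / 100.0 * n_tokens


# Density value taken from the listing above.
density_percent = 0.012

# Hypothetical sample size of one million tokens.
print(expected_active_tokens(density_percent, 1_000_000))
```

Under that assumption, a density of 0.012% corresponds to roughly 120 activating tokens per million, so the feature is sparse.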