Explanations
mentions of a specific term "Boy" within the text
references to the term "Boy" in various contexts
Negative Logits
Token          Logit
aeda           -1.01
aukee          -0.75
podium         -0.73
Ͻ              -0.72
anwhile        -0.72
equilibrium    -0.70
totality       -0.70
instability    -0.69
incinn         -0.69
srf            -0.69
Positive Logits
Token          Logit
friend         1.39
boy            1.28
Boy            1.24
Boy            1.24
Scouts         1.20
cott           1.12
boys           1.10
Scout          1.02
hood           1.00
Girl           0.99
Activations Density 0.006%
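
As a rough illustration of the density figure above, the sketch below assumes that activation density means the fraction of token positions on which the feature's activation is above zero. The source does not state the definition, so that reading is an assumption, and the function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of tokens on which the feature
    fires at all. This definition is not confirmed by the source.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 6 active positions out of 100,000 tokens.
sample = [0.0] * 99_994 + [1.5] * 6
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.006% would correspond to the feature firing on roughly 6 of every 100,000 tokens.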