INDEX
Explanations
the word "boy"
mentions of a boy in various contexts
Negative Logits
Token      Logit
aeda       -0.99
odium      -0.76
ãĥķãĤ©     -0.73
lyak       -0.70
phis       -0.69
adena      -0.68
Aff        -0.68
aukee      -0.67
showc      -0.67
topic      -0.66
Positive Logits
Token      Logit
hood       1.21
friend     1.20
Scouts     1.05
ishly      0.99
boys       0.98
nton       0.91
boys       0.90
boy        0.90
boy        0.89
scout      0.88
Activations Density: 0.021%