INDEX
Explanations
references to the word "kid" and its variations, often in contexts related to youth, identity, or storytelling
Negative Logits
äd        -0.18
ibold     -0.18
ides      -0.18
ầng       -0.17
xis       -0.17
xad       -0.16
èĥİ       -0.15
utom      -0.15
eson      -0.15
aar       -0.15
Positive Logits
ney       0.25
nap       0.25
lington   0.22
neys      0.19
der       0.18
ults      0.18
gloves    0.17
ronic     0.17
NEY       0.17
dee       0.16
Activations Density: 0.013%