Explanations
references to childhood experiences
Negative Logits
Soon          -0.16
ós            -0.14
ihar          -0.14
Soon          -0.14
ws            -0.14
alley         -0.14
à¹īำหà¸Ļ       -0.13
èī¯           -0.13
Prototype     -0.13
erras         -0.13
Positive Logits
little        0.30
younger       0.29
small         0.27
young         0.26
smaller       0.24
little        0.24
kid           0.22
tiny          0.21
peque         0.21
small         0.20
Activations Density 0.058%