INDEX
Explanations
references to toys or brand names associated with toys
Negative Logits
skirts    -0.18
ñana      -0.16
icks      -0.16
erator    -0.16
sdk       -0.15
icast     -0.15
GAN       -0.15
129       -0.15
еÑģÑı     -0.14
yon       -0.14
Positive Logits
ota       0.27
etic      0.24
otas      0.23
Soldiers  0.20
Story     0.20
OTA       0.19
soldiers  0.19
chest     0.19
Soldier   0.18
Chest     0.18
Activations Density: 0.007%