Explanations
references to specific toy collections and their characteristics
Negative Logits
antz        -0.16
nackte      -0.14
ksi         -0.14
pone        -0.14
punkt       -0.13
xbd         -0.13
stanov      -0.13
_PARTITION  -0.13
env         -0.13
hea         -0.13
Positive Logits
pose                0.16
Collect             0.16
Collect             0.16
collect             0.15
figure              0.15
him                 0.15
slaught             0.15
íıī                 0.14
acic                0.14
longleftrightarrow  0.14
Activations Density: 0.006%