Explanations
references to geometric shapes, specifically squares in various contexts
Negative Logits
chers      -0.17
jury       -0.17
essler     -0.16
amines     -0.16
Wayback    -0.15
uen        -0.15
PLIED      -0.15
cher       -0.14
trí        -0.14
squared    -0.14
Positive Logits
footage    0.23
-root      0.22
peg        0.21
根         0.21
/oct       0.20
pants      0.20
ovel       0.20
foot       0.20
-shaped    0.19
-corner    0.19
Activations Density 0.025%