Explanations
descriptions related to architectural features and structures
Negative Logits
Token       Logit
яж          -0.15
otto        -0.15
orks        -0.15
Kendrick    -0.15
лав         -0.15
ея          -0.14
份          -0.14
Fork        -0.14
adies       -0.14
oyer        -0.14
Positive Logits
Token       Logit
surface     0.29
surfaces    0.26
Surface     0.24
surface     0.23
Surface     0.22
Exposed     0.19
urface      0.18
_surface    0.17
face        0.17
facing      0.17
Activations Density: 0.176%