    Explanations

    descriptions related to architectural features and structures

    Negative Logits
    Ñıж
    -0.15
    otto
    -0.15
    orks
    -0.15
     Kendrick
    -0.15
    лав
    -0.15
    еÑı
    -0.14
    份
    -0.14
     Fork
    -0.14
    adies
    -0.14
    oyer
    -0.14
    Positive Logits
     surface
    0.29
     surfaces
    0.26
     Surface
    0.24
    surface
    0.23
    Surface
    0.22
    Exposed
    0.19
    urface
    0.18
    _surface
    0.17
     face
    0.17
     facing
    0.17
    Activation Density: 0.176%

    No Known Activations