    Explanations

    words that contain the substring "son"

    Negative Logits
    zers      -0.07
    neys      -0.07
     latter   -0.07
    tuÄŁ      -0.07
    zk        -0.06
    OCK       -0.06
    piler     -0.06
    ONA       -0.06
    otec      -0.06
    tabl      -0.06
    Positive Logits
    line      0.08
    oma       0.07
    duk       0.07
    der       0.07
    imbus     0.07
    tera      0.07
    ntag      0.07
    da        0.06
    ghi       0.06
    fi        0.06
    Activation Density: 0.008%

    No Known Activations