INDEX
    Explanations

    past participles of verbs

    Negative Logits
    i           -0.22
    asted       -0.17
    lek         -0.16
    ież         -0.16
    s           -0.15
    marvin      -0.15
    adows       -0.15
    incident    -0.15
    ologne      -0.15
    olvers      -0.15
    Positive Logits
    dy          0.26
    gy          0.25
    ifice       0.24
    icts        0.22
    d           0.21
    die         0.21
    ema         0.21
    ging        0.21
    uction      0.20
    dress       0.20
    Activation Density: 0.007%

    No Known Activations