INDEX
    Explanations

    instances of specific verbs related to observation and actions

    Negative Logits
    "pls"       -0.15
    "achen"     -0.15
    "PELL"      -0.15
    "AFE"       -0.15
    "itta"      -0.14
    "itler"     -0.14
    "itte"      -0.14
    "uali"      -0.14
    "anson"     -0.14
    "arring"    -0.14
    Positive Logits
    " worden"      0.37
    " werden"      0.28
    " wird"        0.23
    " becoming"    0.23
    " wurde"       0.23
    " Bec"         0.21
    " wordt"       0.21
    " become"      0.20
    " becomes"     0.20
    "bec"          0.19
    Activation Density: 0.012%

    No Known Activations