    Explanations

    instances of the verb "put" in various forms

    Negative Logits
    "kv"        -0.15
    "itez"      -0.14
    "rsa"       -0.14
    "/from"     -0.14
    "imeo"      -0.14
    "çĨŁ"       -0.14
    "ials"      -0.14
    "ories"     -0.14
    "sms"       -0.14
    "enders"    -0.14
    Positive Logits
    " forth"       0.39
    " aside"       0.35
    " together"    0.33
    "atively"      0.31
    "tering"       0.26
    "ter"          0.25
    "tered"        0.25
    "forth"        0.24
    " forward"     0.23
    "aside"        0.23
    Activation Density: 0.042%

    No Known Activations