INDEX
    Explanations
    Negative Logits
        " my"      -0.11
        " your"    -0.10
        "My"       -0.08
        " you"     -0.08
        "-my"      -0.07
        "your"     -0.07
        " YOUR"    -0.07
        " his"     -0.07
        " My"      -0.07
        " her"     -0.07
    Positive Logits
        "=params"        0.06
        " alış"          0.06
        " kho"           0.06
        "/Sh"            0.06
        " True"          0.06
        " Conversely"    0.06
        " simplement"    0.06
        " podařilo"      0.06
        (blank)          0.06
        "backup"         0.06
    Activation Density: 0.289%

    No Known Activations