INDEX
    Explanations

    phrases indicating frequency or repetition

    Negative Logits
    "formace"     -0.17
    "sert"        -0.15
    "lix"         -0.15
    "â̦”↵↵"       -0.15
    "llib"        -0.15
    "voÅĻ"        -0.15
    "abbo"        -0.15
    "IBUTE"       -0.14
    " somehow"    -0.14
    "ยม"          -0.14
    Positive Logits
    "ìĶ©"         0.21
    "-times"      0.21
    " even"       0.20
    "place"       0.20
    " referred"   0.17
    "ľ"           0.17
    "~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~"   0.16
    "even"        0.15
    "kus"         0.15
    " даже"       0.15
    Activation Density: 0.017%

    No Known Activations