INDEX
    Explanations

    phrases indicating contrast or exceptions

    Negative Logits
    Token        Logit
    "eton"       -0.18
    "spath"      -0.16
    "otti"       -0.15
    " jam"       -0.15
    "undle"      -0.15
    "berger"     -0.15
    ".Api"       -0.14
    "ãģ¤ãģ¶"     -0.14
    "onden"      -0.14
    "sov"        -0.14

    Positive Logits
    Token        Logit
    "auga"       0.17
    "CHAIN"      0.17
    "sz"         0.15
    "ainers"     0.14
    " Neh"       0.14
    "cid"        0.14
    " Chan"      0.13
    " Skinner"   0.13
    "ls"         0.13
    "atan"       0.13

    Act Density 0.117%

    No Known Activations