INDEX
    Explanations

    phrases indicating involvement in activities or events

    Negative Logits
    adt         -0.18
    uela        -0.15
    singular    -0.15
    oa          -0.15
     Hast       -0.15
    sak         -0.15
    rupa        -0.15
    bane        -0.14
    icks        -0.14
    deo         -0.14

    Positive Logits
    ë§Į         0.16
    ertz        0.15
    agen        0.15
    licken      0.14
     PEM        0.14
    ischer      0.14
    æ¶ī         0.14
    zia         0.14
    ongo        0.14
    ulle        0.14

    Act Density 0.019%

    No Known Activations