INDEX
    Explanations

    phrases that indicate actions, particularly those involving claims, allegations, or statements about individuals or entities

    NEGATIVE LOGITS
    onom             -0.15
    reesome          -0.15
    æļ®              -0.15
    ÃĹ↵↵             -0.14
    arkan            -0.14
    λικ              -0.14
    blick            -0.14
     INTERRUPTION    -0.14
    ubo              -0.14
     frei            -0.14
    POSITIVE LOGITS
    undle            0.15
    gle              0.15
    ingo             0.14
     finally         0.14
    pies             0.14
    üs               0.13
    yle              0.13
    ISTER            0.13
     bamboo          0.13
    Uvs              0.13
    Activation Density: 0.057%

    No Known Activations