INDEX
    Explanations

    instances of the word "ban" and its variations

    Negative Logits
    Token    Logit
    oÄŁ      -0.17
    ÛĮ       -0.16
    chten    -0.15
    oze      -0.15
    klass    -0.14
    elig     -0.14
    ãĤĥ      -0.14
    ODO      -0.14
    aÄŁ      -0.14
    eon      -0.13
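Several tokens in the negative-logit list (`oÄŁ`, `ÛĮ`, `ãĤĥ`, `aÄŁ`) look like raw vocabulary strings from a byte-level BPE tokenizer rather than readable text. The sketch below assumes, without confirmation from this page, that the model uses the GPT-2-style byte-to-character table, and inverts that table to recover the underlying UTF-8 text. The names `byte_to_char_table` and `decode_token` are illustrative helpers, not part of any library.

```python
def byte_to_char_table() -> dict[int, str]:
    """Rebuild the byte -> printable-character table used by GPT-2-style
    byte-level BPE (assumed here; the page does not name its tokenizer)."""
    # Bytes that are already printable map to themselves.
    printable = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(0xA1, 0xAC + 1))
        + list(range(0xAE, 0xFF + 1))
    )
    printable_set = set(printable)
    table = {b: chr(b) for b in printable}
    # Every remaining byte is shifted to the next free code point from 256 up.
    offset = 0
    for b in range(256):
        if b not in printable_set:
            table[b] = chr(256 + offset)
            offset += 1
    return table


# Inverse table: displayed character -> original byte value.
CHAR_TO_BYTE = {ch: b for b, ch in byte_to_char_table().items()}


def decode_token(token: str) -> str:
    """Map each displayed character back to its byte, then decode as UTF-8.

    Raises KeyError if the token contains a character outside the table,
    which would suggest the tokenizer assumption is wrong.
    """
    raw = bytes(CHAR_TO_BYTE[ch] for ch in token)
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    for tok in ["oÄŁ", "ÛĮ", "ãĤĥ", "aÄŁ", "chten"]:
        print(repr(tok), "->", repr(decode_token(tok)))
```

Under that assumption, `oÄŁ` and `aÄŁ` decode to `oğ` and `ağ`, `ÛĮ` to the Persian letter `ی`, and `ãĤĥ` to the kana `ゃ`. Plain ASCII tokens such as `chten` pass through unchanged.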
    Positive Logits
    Token      Logit
    ishment    0.32
    ished      0.30
    offee      0.29
    quets      0.28
    tering     0.25
    ishing     0.25
    jo         0.24
    ister      0.23
    jax        0.23
    anas       0.23
    Activation Density: 0.013%

    No Known Activations