INDEX
    Explanations

    negative statements and restrictions related to policies or rules

    Negative Logits
    vell          -0.17
    ville         -0.15
    iling         -0.15
     Adapt        -0.14
    vier          -0.14
    Wunused       -0.14
    tek           -0.14
     ru           -0.14
     fair         -0.13
     organ        -0.13
    Positive Logits
    asca          0.19
     necessarily  0.15
    ches          0.14
    èĢ            0.14
    ëł            0.14
     ли           0.14
    ecs           0.14
    nelle         0.14
    erokee        0.14
    .li           0.14
    Act Density 0.120%

    No Known Activations