    NEGATIVE LOGITS
    Token              Logit
    "Participant"      -0.07
    "pattern"          -0.06
    "\xe"              -0.06
    " note"            -0.06
    " kost"            -0.06
    " Participant"     -0.06
    " keinen"          -0.06
    " vrch"            -0.06
    "mind"             -0.06
    "TestCategory"     -0.06
    POSITIVE LOGITS
    Token              Logit
    " наличи"          0.07
    "_shell"           0.07
    "neo"              0.07
    "ـل"               0.06
    " ANSI"            0.06
    "fit"              0.06
    " mechanism"       0.06
    "ibbon"            0.06
    "RSpec"            0.06
    "rabilir"          0.06
    Activation Density: 0.048%

    No Known Activations