INDEX
    Explanations

    words related to performance assessment and comparisons

    Negative Logits
    シー
    -0.16
    zá
    -0.15
     Bes
    -0.14
    kker
    -0.14
    hausen
    -0.14
    pany
    -0.14
    urn
    -0.13
    иной
    -0.13
    ante
    -0.13
    olare
    -0.13
    Positive Logits
     elsewhere
    0.25
     identical
    0.19
    在
    0.18
     ợ
    0.17
     abroad
    0.16
     Ở
    0.16
    ignum
    0.16
     在
    0.16
    same
    0.15
    .twig
    0.15
    Act Density 0.267%

    No Known Activations