INDEX
    Explanations

    references to institutions, organizations, or formal entities

    Negative Logits
    "aris"     -0.15
    "ken"      -0.15
    "ноч"      -0.15
    "otte"     -0.14
    "65"       -0.14
    "acle"     -0.14
    "uding"    -0.14
    "urdu"     -0.14
    "orne"     -0.14
    "weise"    -0.14
    Positive Logits
    " described"   0.23
    " mentioned"   0.22
    "explained"    0.19
    "-described"   0.19
    " discussed"   0.19
    "chers"        0.17
    "disc"         0.16
    "mentioned"    0.16
    " shown"       0.16
    " explained"   0.16
    Activation Density: 0.003%

    No Known Activations