INDEX
    Explanations

    proper nouns and specific terms related to organizations and policies

    Negative Logits
    krát      -0.15
    Mocks     -0.14
    tep       -0.14
    ¬ģ        -0.13
    UNG       -0.13
    iew       -0.13
    Prov      -0.13
    ackers    -0.13
    ÑĥÑĩа     -0.13
     Karlov   -0.13
    Positive Logits
    y         0.19
    041       0.14
    elia      0.14
    vig       0.14
    hir       0.13
    abin      0.13
    ØŃداث     0.13
    CppType   0.13
    ÌĨ         0.12
    032       0.12
    Act Density 0.002%

    No Known Activations