INDEX
    Explanations

    occurrences of specific foreign characters or characters from a different encoding

    Negative Logits
    воля
    -0.20
    надлеж
    -0.20
    endale
    -0.17
    FromBody
    -0.17
    quia
    -0.16
    корист
    -0.16
    rtle
    -0.16
    edBy
    -0.15
    мещ
    -0.15
    оруж
    -0.15
    Positive Logits
    rench
    0.17
    apan
    0.17
    n
    0.15
     hence
    0.15
    beat
    0.15
    /to
    0.15
    ni
    0.15
    yssey
    0.15
     consequence
    0.15
    m
    0.14
    Act Density 0.006%

    No Known Activations