INDEX
    Explanations

    Russian language

    Negative Logits
    Token           Logit
    " Nixon"        -0.08
    "^("            -0.08
    "anko"          -0.07
    "(pro"          -0.07
    " Dod"          -0.07
    ".error"        -0.07
    "^^"            -0.07
    " combining"    -0.07
    "hanga"         -0.07
    "VS"            -0.07
    Positive Logits
    Token           Logit
    " forgotten"    0.08
    " breath"       0.08
    " pono"         0.08
    " allegiance"   0.08
    "了一"          0.07
    " પોતાની"       0.07
    " succumb"      0.07
    "estros"        0.07
    " accustomed"   0.07
    " forgot"       0.07
    Activation Density: 0.010%

    No Known Activations