    Negative Logits
    Token            Logit
    "mut"            -0.07
    " Telefon"       -0.07
    "átky"           -0.07
    " III"           -0.07
    " Commit"        -0.06
    " tersebut"      -0.06
    " stationary"    -0.06
    "ockey"          -0.06
    "­tion"           -0.06
    " Journal"       -0.06

    Positive Logits
    Token            Logit
    " fast"          0.08
    "iden"           0.08
                     0.07
    "-fast"          0.06
    " faster"        0.06
    " slow"          0.06
    "grese"          0.06
    " kou"           0.06
    " Пост"          0.06
    "Pres"           0.06

    Activation Density: 0.022%

    No Known Activations