    NEGATIVE LOGITS
    north          -0.07
    (blank)        -0.07
    不曾           -0.07
    _inverse       -0.06
     MessageType   -0.06
    avi            -0.06
    (blank)        -0.06
     Master        -0.06
    .NORTH         -0.06
    .after         -0.06
    POSITIVE LOGITS
    Wunused        0.08
     зая           0.08
    *X             0.07
    MODEL          0.07
    StateChanged   0.07
    ;">↵           0.07
     заявил        0.07
    äl             0.07
     Basically     0.07
    -reviewed      0.07
    Activation Density: 0.242%

    No Known Activations