    Negative Logits
    principalTable  -0.57
     ainfi          -0.56
     виправивши     -0.55
    featureID       -0.53
    ValueStyle      -0.52
    BIÉN            -0.52
    ScopeManager    -0.50
     miniaturka     -0.50
     ligiloj        -0.50
    Jereo           -0.50
    Positive Logits
     betweenstory   0.44
     Надо           0.43
    是为了          0.43
                    0.42
    是要            0.42
    =('             0.41
     TO             0.40
     Kno            0.40
    目的是          0.40
     drawing        0.40
    Act Density 0.028%

    No Known Activations