INDEX
    Explanations
    Negative Logits
    "不了"               -0.08
    (no token shown)     -0.08
    " powied"            -0.08
    "-minded"            -0.08
    "Txn"                -0.07
    "回答"               -0.07
    " transactional"     -0.07
    " terro"             -0.07
    (no token shown)     -0.07
    " rhin"              -0.07
    Positive Logits
    " вверх"             0.10
    " menuju"            0.09
    " warped"            0.08
    " вниз"              0.08
    " omhoog"            0.08
    " fleet"             0.08
    "arched"             0.07
    " taller"            0.07
    " mont"              0.07
    " terminating"       0.07
    Act Density 0.003%

    No Known Activations