INDEX
    Negative Logits
        (token text missing)  -0.09
        "连续"  -0.08
        " chaque"  -0.08
        (token text missing)  -0.08
        " часы"  -0.08
        " තිබ"  -0.08
        " jardín"  -0.08
        " árbol"  -0.08
        " besonder"  -0.08
        "gry"  -0.08
    Positive Logits
        "947"  0.08
        " Nathan"  0.08
        "473"  0.08
        "-based"  0.08
        "_eta"  0.07
        "_certificate"  0.07
        "attano"  0.07
        "owitz"  0.07
        (token text missing)  0.07
        " initially"  0.07
    Act Density 0.028%

    No Known Activations