NEGATIVE LOGITS
     укреп          0.51
    Rod             0.50
    ્રી             0.50
     suasana        0.50
     갔다            0.49
                    0.49
    उन्होंने        0.48
     _{\            0.48
     общего         0.48
     τρόπο          0.48
    POSITIVE LOGITS
     because        0.67
     not            0.57
     false          0.57
     maximise       0.57
     inutile        0.57
     BECAUSE        0.57
     prevention     0.56
     slowdown       0.56
     every          0.56
     Immobilien     0.56
    Activation Density: 0.002%

    No Known Activations