INDEX
    Explanations
    Negative Logits
     empathy        -0.07
     кроме          -0.07
                    -0.06
    _enqueue        -0.06
    avascript       -0.06
     dolphins       -0.06
     hồ             -0.06
                    -0.06
     Alexandre      -0.06
     verschill      -0.06
    Positive Logits
    )."              0.08
     reminiscent     0.07
    _write           0.06
    (integer         0.06
     Mong            0.06
    Balance          0.06
    Instrument       0.06
    )를              0.06
    цез              0.06
    ournaments       0.06
    Activation Density: 0.195%

    No Known Activations