INDEX
    Explanations
    Negative Logits
     situ          -0.10
     Cil           -0.08
     swimmer       -0.08
     slipped       -0.08
     simptom       -0.08
    elp            -0.08
     regelmäß      -0.07
     zvý           -0.07
    _sheet         -0.07
    _terminal      -0.07
    Positive Logits
    usstsein       0.08
     विरोध          0.08
    _JSON          0.08
     ارتباط         0.08
     emphas        0.07
    ుడు            0.07
     JSON          0.07
    աշին           0.07
     آلات          0.07
     converge      0.07
    Act Density 0.004%

    No Known Activations