INDEX
    Explanations
    Negative Logits
    ça  0.40
     settling  0.38
    stabil  0.37
     multiplicado  0.36
     everywhere  0.36
    elon  0.36
     सारे  0.36
    compile  0.36
     பால்  0.35
    alcool  0.35
    Positive Logits
    0.43
    ('<  0.42
    0.40
    価値  0.38
    ทย์  0.38
     differentiator  0.38
     cấu  0.37
     Embedded  0.36
     уровне  0.36
    မျိုး  0.36
    Act Density 0.000%

    No Known Activations