INDEX
    Explanations

    simple, reflecting, proportionally

    Negative Logits
     decision          0.40
    Moz                0.40
    (no token shown)   0.39
    ივ                 0.38
    adó                0.37
     evolutionary      0.37
    (no token shown)   0.37
    Mozilla            0.36
     decisions         0.36
    (no token shown)   0.36
    Positive Logits
    ग्रस्त               0.42
     Wab               0.39
     කරයි              0.38
     disheart          0.38
     tightened         0.38
     erle              0.37
     किफ               0.37
    کی                 0.36
     бер               0.36
     мате              0.35
    Act Density 0.000%

    No Known Activations