    Negative Logits
    Token              Logit
     exerts            0.78
     unchallenged      0.76
                       0.76
    steroidal          0.75
                       0.75
     feiern            0.74
    unistd             0.73
     ausschließlich    0.72
     freep             0.72
     दशहरा             0.71
    Positive Logits
    Token    Logit
     +"      1.75
     +       1.67
    +",      1.55
    +"       1.52
     +'      1.31
     +"\     1.31
    +'       1.29
    +',      1.27
     $+$     1.24
    +"\      1.22
    Activation Density: 0.529%

    No Known Activations