INDEX
    Explanations

    Code/Markup

    Negative Logits
     movie           -0.07
     detox           -0.06
     disco           -0.06
     omit            -0.06
    //:              -0.06
     yapı            -0.06
     яку             -0.06
    -processing      -0.06
    ňování           -0.06
    "'               -0.06
    Positive Logits
    Paragraph        0.06
    adget            0.06
    apiro            0.06
    =======          0.06
    اهش              0.06
    اری              0.05
                     0.05
    _dice            0.05
     plated          0.05
    =Math            0.05
    Act Density 0.030%

    No Known Activations