INDEX
    Explanations

    Flammability

    Negative Logits
        Token               Logit
        "plat"              -0.07
        "take"              -0.07
        " EPS"              -0.07
        ".':"               -0.07
        " glued"            -0.07
        " INLINE"           -0.07
        "outside"           -0.07
        "etta"              -0.07
        "Heat"              -0.07
        "related"           -0.06
    Positive Logits
        Token               Logit
        " вним"             0.09
        "amma"              0.08
        "amm"               0.08
        [no token text]     0.07
        "/twitter"          0.07
        "ارف"               0.06
        "овари"             0.06
        [no token text]     0.06
        " Nash"             0.06
        " Voll"             0.06
    Activation Density: 0.004%

    No Known Activations