INDEX
    Explanations
    Negative Logits
     disasters   -0.99
     sufferers   -0.97
     disaster    -0.94
     Мексичка    -0.94
     avoient     -0.92
    Disaster     -0.91
    NOPQRST      -0.90
     disastrous  -0.88
     normaux     -0.87
     étoient     -0.85
    Positive Logits
    ly     0.75
     of    0.62
    acy    0.56
     for   0.55
    ist    0.55
    ness   0.53
    ante   0.53
    ism    0.53
    ized   0.52
     (     0.51
    Act Density 0.063%

    No Known Activations