INDEX
    Explanations

    words related to negative attributes or criticisms

    words related to imbalance or imperfections

    Negative Logits
    "NetMessage"        -0.87
    "ttes"              -0.79
    "Downloadha"        -0.75
    "DragonMagazine"    -0.70
    " Warwick"          -0.69
    " Mississ"          -0.68
    " slash"            -0.67
    " Morales"          -0.67
    " Rav"              -0.66
    " Witches"          -0.65
    Positive Logits
    "balanced"          1.17
    "unity"             1.14
    "itated"            1.11
    "itating"           1.09
    "itates"            1.04
    "mer"               1.04
    "mediate"           1.04
    "medi"              1.03
    "press"             1.02
    "ply"               1.01
    Activation Density: 0.007%

    No Known Activations