INDEX
    Explanations

    words related to negative outcomes or situations

    Negative Logits
    Token            Logit
    "jednoc"         -0.76
    "PhysRevD"       -0.73
    " connexes"      -0.72
    "rijke"          -0.72
    "IUrlHelper"     -0.70
    "GOTREF"         -0.70
    " Laing"         -0.68
    " Bourgoin"      -0.68
    "ίκη"            -0.66
    "cupertino"      -0.66
    Positive Logits
    Token            Logit
    " worse"         1.29
    "Worse"          1.22
    " Worse"         1.17
    " worst"         1.14
    " Worst"         1.07
    "worse"          1.07
    " bad"           1.05
    " BAD"           1.05
    " Bad"           1.02
    "Worst"          1.02
    Activation Density: 0.147%

    No Known Activations