    Explanations

    instances of emotional and health-related terms

    Negative Logits
     McMahon      -0.15
    ä»ĺ           -0.15
    isans         -0.14
    ÏĢÎŃ          -0.14
    yla           -0.14
    ven           -0.14
    ickerView     -0.14
    ारण           -0.14
    rint          -0.14
    lect          -0.14

    Positive Logits
     em           0.17
    phasis        0.15
    arrass        0.15
    401           0.15
    manuel        0.15
    GENCY         0.15
    roid          0.14
    tent          0.14
    itters        0.14
     Prec         0.14

    Activation Density 0.055%

    No Known Activations