INDEX
    Explanations

    references to spreading rumors or unfounded claims

    phrases indicating rumors or accusations

    Negative Logits
    atre      -0.72
    borg      -0.66
    pling     -0.65
    onomic    -0.62
    ocks      -0.62
    ien       -0.61
    ouk       -0.61
    osures    -0.61
    waters    -0.61
    ey        -0.60
    Positive Logits
     accompanies    0.98
    soever          0.92
     arose          0.87
     preceded       0.84
     they           0.77
     contradicts    0.77
    ©¶æ             0.76
     accompanied    0.75
     contradicted   0.73
     surrounds      0.72
    Activation Density: 0.192%

    No Known Activations