INDEX
    Explanations

    references to specific people or events, especially related to announcements or declarations

    references to specific individuals or events in sports and culture

    Negative Logits
     keyboards      -0.57
     slur           -0.51
     fuzz           -0.51
     Occasionally   -0.51
     thyroid        -0.47
     tidy           -0.46
     innocuous      -0.45
     manic          -0.44
     feminine       -0.44
     stereotype     -0.44
    Positive Logits
    cember          0.63
    OUP             0.60
    ETHOD           0.59
    numbered        0.58
    DonaldTrump     0.57
    HQ              0.55
    itialized       0.54
    DEM             0.54
    razil           0.54
    ģ«              0.53
    Act Density 2.629%

    No Known Activations