INDEX
    Explanations

    mentions of names of people or organizations

    proper names, particularly individuals and organizations related to media, politics, and events

    Negative Logits
    Token          Logit
    "atform"       -0.83
    "FG"           -0.70
    "quarters"     -0.67
    "leneck"       -0.65
    "etheus"       -0.64
    " psychiat"    -0.64
    " incom"       -0.63
    " skelet"      -0.62
    "£ı"           -0.61
    "yright"       -0.60

    Positive Logits
    Token          Logit
    "belt"          0.81
    "rome"          0.70
    "uous"          0.69
    " Hera"         0.69
    "oga"           0.67
    "utsche"        0.67
    "cheat"         0.67
    "atchewan"      0.66
    "pants"         0.65
    "pta"           0.64
    Activation Density: 0.460%

    No Known Activations