INDEX
    Explanations

    pronouns referring to people or things

    pronouns referring to people

    Negative Logits
    "ielding"     -0.78
    "east"        -0.68
    "semb"        -0.68
    "shown"       -0.66
    " Pwr"        -0.66
    "cond"        -0.65
    "ricanes"     -0.65
    "Rex"         -0.63
    "Fla"         -0.63
    "athon"       -0.62

    Positive Logits
    " Majesty"     1.00
    "illac"        0.81
    " majesty"     0.79
    " fucking"     0.78
    "mos"          0.77
    " smokes"      0.77
    " fuckin"      0.76
    " fucked"      0.72
    " behav"       0.71
    " Sly"         0.69
    Activation Density: 0.487%

    No Known Activations