INDEX
    Explanations

    personal pronouns referring to people

    Negative Logits
    Token        Logit
    "opter"      -0.58
    "mber"       -0.57
    " Cong"      -0.56
    " Abbey"     -0.53
    " rush"      -0.52
    " source"    -0.50
    " Huff"      -0.49
    " sep"       -0.49
    "IDENT"      -0.49
    "isbury"     -0.49
    Positive Logits
    Token               Logit
    " etc"              1.41
    "etc"               1.26
    "whatever"          0.94
    " blah"             0.83
    " .............."   0.77
    "ĪĴ"                0.76
    "anything"          0.75
    " oh"               0.70
    " â̦"               0.69
    " whatever"         0.67
    Activation Density: 0.346%

    No Known Activations