INDEX
    Explanations

    proper nouns representing people

    references to individuals or pronouns indicating people

    Negative Logits
    " feature"        -0.71
    " features"       -0.67
    " stem"           -0.63
    " packed"         -0.62
    " preparation"    -0.61
    " enc"            -0.61
    " primitive"      -0.61
    " ends"           -0.60
    " depress"        -0.60
    " storage"        -0.59
    Positive Logits
    "who"              3.74
    "whose"            2.47
    "Who"              1.81
    "WHO"              1.79
    " whom"            1.78
    " who"             1.72
    "how"              1.49
    "where"            1.43
    " Who"             1.41
    " WHO"             1.38
    Activation Density: 0.017%

    No Known Activations