INDEX
    Explanations

    phrases related to body parts and actions

    possessive pronouns and references to ownership

    Negative Logits
    Token           Logit
    " hereafter"    -0.73
    " Rowling"      -0.73
    " Pwr"          -0.67
    "GAN"           -0.66
    "'-"            -0.65
    " Eucl"         -0.64
    " Hear"         -0.64
    " Izan"         -0.62
    "ablishment"    -0.62
    " Aren"         -0.61
    Positive Logits
    Token           Logit
    " own"          1.38
    " fingers"      1.05
    " knees"        1.01
    "selves"        1.00
    " nose"         0.98
    " fists"        0.96
    "self"          0.96
    " toes"         0.94
    " hips"         0.93
    " lips"         0.91
    Activation Density: 0.146%

    No Known Activations