INDEX
    Explanations

    words related to the human body

    mentions of body and body image

    Negative Logits
    " Hoover"     -0.80
    " Kafka"      -0.71
    " Clover"     -0.71
    " Ans"        -0.68
    "ãĥĻ"         -0.68
    " Pis"        -0.68
    " Booth"      -0.67
    " Nex"        -0.67
    " Dickens"    -0.62
    " Jarrett"    -0.62
    Positive Logits
    "guards"      1.34
    "builders"    1.19
    "guard"       1.19
    "building"    1.18
    "builder"     1.14
    "weight"      1.02
    "parts"       1.01
    " politic"    0.94
    "wash"        0.92
    "anguage"     0.91
    Act Density 0.033%

    No Known Activations