INDEX
    Explanations

    references to the concept of "body" in various contexts

    Negative Logits
        Token         Logit
        "ery"         -0.24
        "umber"       -0.21
        "eries"       -0.18
        "ures"        -0.17
        "ally"        -0.16
        "atoria"      -0.15
        "ophobia"     -0.15
        "erna"        -0.15
        "ERY"         -0.15
        "imum"        -0.15

    Positive Logits
        Token         Logit
        "guards"       0.35
        "guard"        0.30
        " politic"     0.27
        "builders"     0.26
        "builder"      0.25
        "weight"       0.25
        "building"     0.24
        "wide"         0.23
        "mind"         0.19
        "674"          0.19

    Activation Density 0.041%

    No Known Activations