    Explanations

    phrases describing significant actions or events, often carrying emotional weight

    references to specific actions or events labeled as "acts"

    Negative Logits
     corners      -0.81
     ceilings     -0.73
     Flavoring    -0.69
     edges        -0.67
     Pavilion     -0.66
     sshd         -0.65
     Pyramid      -0.63
     Wheat        -0.63
     walls        -0.63
    aways         -0.61
    Positive Logits
     luck         0.90
     sabotage     0.85
     kindness     0.80
     heroism      0.78
     omission     0.76
     aggression   0.71
     desperation  0.69
    asses         0.68
    OUP           0.67
     vandalism    0.67
    Activation Density: 0.060%

    No Known Activations