INDEX
    Explanations

    words related to grotesque or horrifying imagery

    terms related to extreme or violent situations

    Negative Logits
    BALL       -0.76
    horn       -0.72
    ACP        -0.72
    WARE       -0.68
    MQ         -0.66
    roads      -0.66
    fields     -0.64
     Logged    -0.64
    STD        -0.62
    wine       -0.62
    Positive Logits
    ities      1.25
    ity        1.24
    itous      1.18
    acies      1.08
    inals      1.08
    als        1.00
    itors      0.98
    acy        0.97
    ians       0.97
    agi        0.96
    Act Density 0.035%

    No Known Activations