INDEX
    Explanations

    words related to action and violence

    Negative Logits
    Token        Logit
    "gling"      -0.72
    "orf"        -0.63
    "inately"    -0.63
    " Fey"       -0.61
    "olls"       -0.61
    " porous"    -0.61
    "owship"     -0.60
    "oiler"      -0.60
    "ringe"      -0.60
    "mbuds"      -0.59
    Positive Logits
    Token        Logit
    "ives"       0.93
    "ivism"      0.91
    "ivated"     0.89
    " Replay"    0.86
    "iveness"    0.85
    "ality"      0.80
    "able"       0.80
    "ual"        0.80
    "aries"      0.79
    "uated"      0.77
    Act Density 0.482%

    No Known Activations