    Explanations

    phrases related to taking action or fighting back

    phrases that express resistance or standing up against challenges

    Negative Logits
    chell
    -0.79
    ãĥ¼ãĥ«
    -0.78
    address
    -0.76
    arlane
    -0.71
     Helpful
    -0.68
    clair
    -0.68
    chnology
    -0.68
    ":-
    -0.66
    çīĪ
    -0.66
    hee
    -0.64
    Positive Logits
     sqor
    0.87
    enegger
    0.87
    ardless
    0.79
     attrition
    0.75
     urge
    0.71
     against
    0.70
     extinction
    0.69
     raged
    0.69
    ipeg
    0.69
     survival
    0.68
    Activation Density 0.148%

    No Known Activations
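
Two entries in the negative-logit list (ãĥ¼ãĥ« and çīĪ) look like mojibake, but they are consistent with how a byte-level BPE vocabulary prints non-ASCII bytes. The sketch below assumes a GPT-2-style byte-to-character table; the source does not say which tokenizer produced these tokens, so treat that as an assumption. The helper names `byte_level_table` and `decode_token` are hypothetical and exist only for this illustration.

```python
def byte_level_table():
    """Byte -> printable character map, in the style of GPT-2's byte-level BPE (assumed)."""
    # Printable, non-space bytes are kept as themselves.
    keep = list(range(0x21, 0x7F)) + list(range(0xA1, 0xAD)) + list(range(0xAE, 0x100))
    table = {b: chr(b) for b in keep}
    # Every remaining byte is shifted to code points starting at 256, in byte order.
    offset = 0
    for b in range(256):
        if b not in table:
            table[b] = chr(256 + offset)
            offset += 1
    return table


# Inverse map: displayed character -> original byte.
CHAR_TO_BYTE = {ch: b for b, ch in byte_level_table().items()}


def decode_token(token):
    """Recover the UTF-8 text behind a byte-level token string."""
    raw = bytearray()
    for ch in token:
        if ch in CHAR_TO_BYTE:
            raw.append(CHAR_TO_BYTE[ch])
        else:
            # Character already shown decoded (e.g. a plain leading space).
            raw.extend(ch.encode("utf-8"))
    return raw.decode("utf-8", errors="replace")


for tok in ["ãĥ¼ãĥ«", "çīĪ", " against"]:
    print(repr(tok), "->", repr(decode_token(tok)))
```

Under that assumption, the two tokens decode to the Japanese katakana sequence ール and the CJK character 版, while tokens already shown with a plain leading space pass through unchanged.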