INDEX
    Explanations

    phrases related to strength and weakness

    words and phrases related to perceptions of weakness and strength

    Negative Logits
    mentioned    -0.67
    tions        -0.65
    iland        -0.62
    tails        -0.59
    =#           -0.59
    ancies       -0.56
    anooga       -0.56
    sequently    -0.55
    undo         -0.54
    arton        -0.53
    Positive Logits
     underdog       0.62
     savior         0.61
     rog            0.60
     inferior       0.60
     discipl        0.60
    chwitz          0.57
    ".[             0.57
    pic             0.56
     coward         0.56
     utilitarian    0.56
    Activation Density: 0.738%

    No Known Activations