INDEX
    Explanations

    phrases related to addressing issues, making policy changes, and connecting with people in various contexts

    Negative Logits
    Token           Logit
    "astical"       -0.89
    "effects"       -0.80
    "robe"          -0.75
    "icity"         -0.72
    "ardless"       -0.71
    "claimed"       -0.68
    "ventions"      -0.67
    " similarly"    -0.65
    "similar"       -0.64
    "Cas"           -0.63
    Positive Logits
    Token           Logit
    " Heller"       0.65
    " Canaver"      0.65
    " Mehran"       0.64
    " nutshell"     0.63
    " Scrib"        0.63
    " Palin"        0.62
    " Hannity"      0.61
    " Leh"          0.60
    " motivating"   0.60
    " Rudd"         0.60
    Act Density 1.153%

    No Known Activations