INDEX
    Explanations

    phrases indicating approval or support

    expressions related to voting in favor of proposals or legislation

    Negative Logits
    " Brist"    -0.71
    "tis"       -0.69
    " Gorge"    -0.64
    "LES"       -0.64
    "ı"         -0.63
    "legged"    -0.63
    "ridges"    -0.63
    "liam"      -0.61
    " Pist"     -0.61
    " Torn"     -0.60
    Positive Logits
    "itism"       1.25
    "ability"     0.79
    "ative"       0.76
    " favoring"   0.75
    "ably"        0.75
    "ality"       0.74
    "uate"        0.72
    "parency"     0.71
    "ibility"     0.71
    "atives"      0.71
    Activation Density: 0.016%

    No Known Activations