INDEX
    Explanations

    references to voting choices and political decisions

    Negative Logits
    Token           Logit
    ".boot"         -0.17
    "nde"           -0.16
    "moz"           -0.15
    "IPH"           -0.15
    ".cloudflare"   -0.15
    "ãĥ³ãĤº"        -0.15
    " Normalize"    -0.14
    "atica"         -0.14
    " toItem"       -0.14
    "normalize"     -0.14
    Positive Logits
    Token           Logit
    "choice"        0.17
    " casting"      0.16
    " choice"       0.16
    "éĢī"           0.15
    " choosing"     0.15
    " candidate"    0.15
    " alignment"    0.15
    "eros"          0.14
    "alignment"     0.14
    " abst"         0.14
    Activation Density: 0.178%

    No Known Activations