    Explanations

    phrases related to accountability and public scrutiny of figures in positions of power

    Negative Logits
    Token            Logit
    "erot"           -0.17
    "Animate"        -0.16
    "tel"            -0.15
    "hol"            -0.15
    " Animalia"      -0.15
    "lg"             -0.14
    "Inlining"       -0.14
    "entai"          -0.14
    "ham"            -0.14
    " Marcos"        -0.14
    Positive Logits
    Token            Logit
    "ulet"           0.16
    " bond"          0.15
    "_ASC"           0.14
    "ств"            0.14
    "::$"            0.14
    "abee"           0.14
    "_conditions"    0.14
    "δρο"            0.13
    "elocity"        0.13
    "altern"         0.13
    Activation Density: 0.008%

    No Known Activations