    Explanations

    references to progress and progressive ideologies

    Negative Logits
    Token     Logit
    upo       -0.17
    indow     -0.17
    pais      -0.15
    ÑĨе       -0.15
    icina     -0.15
    strap     -0.15
    ijing     -0.15
    unami     -0.14
    entai     -0.14
    aces      -0.14

    Positive Logits
    Token     Logit
    ions      0.38
    ional     0.36
    ion       0.36
    ively     0.35
    ivism     0.35
    ive       0.30
    iveness   0.30
    sing      0.28
    ives      0.27
    ivity     0.27
    Activation Density: 0.019%

    No Known Activations