INDEX
    Explanations

    references to societal concepts and institutions

    Negative Logits
    -0.67
     Prin
    -0.65
    cur
    -0.64
    tral
    -0.63
    ok
    -0.62
    ال
    -0.60
    𝓪
    -0.59
    рас
    -0.59
    р
    -0.59
    amal
    -0.59
    Positive Logits
     Society
    2.02
     Societies
    2.00
     SOCIETY
    1.99
     societies
    1.97
     society
    1.95
    Society
    1.94
    society
    1.86
     sociedad
    1.35
     Gesellschaft
    1.28
     sociedade
    1.25
    Act Density 0.058%

    No Known Activations