INDEX
    Explanations

    inherent superiority and inequalities

    Negative Logits
    Token        Logit
    "And"        0.50
    " And"       0.49
    "成功"       0.43
    (blank)      0.43
    "Λ"          0.42
    " রহস্য"      0.40
    "所以"       0.40
    "希少"       0.40
    "ί"          0.40
    "ts"         0.40
    Positive Logits
    Token            Logit
    " ideologies"    0.51
    " slums"         0.48
    " algebras"      0.48
    " curricula"     0.47
    " myocard"       0.45
    " propagand"     0.45
    " protesters"    0.45
    " bureaucrats"   0.44
    " unfit"         0.44
    " warships"      0.43
    Activation Density: 0.002%

    No Known Activations