INDEX
    Explanations

    references to deception or pretense in political and social contexts

    Negative Logits
    anki            -0.16
    ียà¸Ķ          -0.16
    è®              -0.15
    chner           -0.14
     zdrav          -0.14
    ynos            -0.14
    anke            -0.14
    atron           -0.14
    atu             -0.14
    aptive          -0.14
    Positive Logits
     somehow        0.20
     superior       0.18
     representing   0.17
     expertise      0.17
     experts        0.17
     sophistication 0.17
    represent       0.15
     progress       0.15
     authority      0.15
     victim         0.14
    Activation Density: 0.149%

    No Known Activations