INDEX
    Explanations

    words related to social issues and criticisms of authority

    Negative Logits
    Token       Logit
    "amu"       -0.17
    " Dare"     -0.15
    "anchise"   -0.15
    "jour"      -0.15
    " th"       -0.15
    " tac"      -0.15
    " PD"       -0.14
    " hak"      -0.14
    "PD"        -0.14
    "么"        -0.14
    Positive Logits
    Token       Logit
    "arus"      0.15
    "otec"      0.14
    ".Classes"  0.14
    "šku"       0.14
    "народ"     0.14
    "cz"        0.14
    " Rangers"  0.14
    "(Table"    0.14
    "ogh"       0.14
    "avicon"    0.14
    Activation Density: 0.028%

    No Known Activations