    Explanations

    references to obedience and compliance with authority

    Negative Logits
    ÑģÑĤÑĭ
    -0.16
    à¹Īาย
    -0.16
     Lomb
    -0.16
    sth
    -0.15
     Engel
    -0.14
    enburg
    -0.14
    olio
    -0.14
    ع
    -0.14
    ãĥ§
    -0.14
    irler
    -0.14
    Positive Logits
    urate
    0.16
    ÃŃch
    0.16
    eel
    0.16
    currentColor
    0.15
     sexist
    0.15
    longleftrightarrow
    0.14
    FUL
    0.14
    Batch
    0.14
    .opendaylight
    0.14
    cott
    0.13
    Act Density 0.006%

    No Known Activations