INDEX
    Explanations

    phrases related to honesty and transparency

    Negative Logits
    Ñĩи          -0.15
    sea          -0.15
    isz          -0.14
    ابت          -0.14
     reli        -0.14
    stå          -0.14
    اغ           -0.14
    oola         -0.14
    ISMATCH      -0.14
    defgroup     -0.14
    Positive Logits
     honest      0.27
     honesty     0.23
     candid      0.21
     frank       0.19
     honestly    0.17
     Kauf        0.17
     admit       0.16
    ecta         0.15
    ãĥ©ãĥ³ãĥī      0.15
     Honest      0.15
    Activation Density: 0.082%

    No Known Activations