INDEX
    Explanations

    instances of the word "honest"

    Negative Logits
    "acid"           -0.74
    "joined"         -0.71
    "chairs"         -0.68
    " Krish"         -0.68
    "arthy"          -0.67
    "interrupted"    -0.67
    "Ĥİ"             -0.67
    "İĭ"             -0.66
    "ombat"          -0.65
    "lav"            -0.65
    Positive Logits
    " broker"       0.87
    " honest"       0.87
    " appraisal"    0.82
    " truthful"     0.79
    " honesty"      0.77
    " candid"       0.73
    "spection"      0.70
    " frank"        0.69
    " mistake"      0.69
    "Reporting"     0.68
    Activation Density: 0.025%

    No Known Activations