INDEX
    Explanations

    instances of the word "honest"

    Negative Logits
    LAN           -0.86
    CHAT          -0.73
     Krish        -0.72
    617           -0.72
    515           -0.69
    lav           -0.69
    acid          -0.69
     Libraries    -0.68
    avia          -0.67
    KEN           -0.67
    Positive Logits
     honest       0.91
     honesty      0.85
     broker       0.82
     truthful     0.73
     princ        0.73
    urance        0.71
     sounding     0.70
    lly           0.70
     frank        0.70
     Honest       0.68
    Act Density 0.015%

    No Known Activations