    Explanations

    phrases emphasizing honesty

    expressions related to honesty

    Negative Logits
    Token           Logit
    " Krish"        -0.76
    " Libraries"    -0.74
    "LAN"           -0.74
    "interrupted"   -0.72
    "acid"          -0.70
    "enegger"       -0.70
    " Autism"       -0.69
    "levels"        -0.68
    "515"           -0.67
    "Stud"          -0.67
    Positive Logits
    Token           Logit
    " honest"       1.08
    " truthful"     0.92
    " honesty"      0.85
    "cipled"        0.78
    " broker"       0.76
    "onest"         0.76
    " dece"         0.73
    "parency"       0.71
    " Honest"       0.70
    "rencies"       0.70
    Activation Density: 0.009%

    No Known Activations