INDEX
    Explanations

    expressions indicating honesty or frankness

    expressions of honesty or frankness in opinions

    NEGATIVE LOGITS
    etting      -0.71
     Landing    -0.70
     Blades     -0.70
    ied         -0.62
    lings       -0.61
    arthy       -0.61
    tailed      -0.61
    akh         -0.59
    vich        -0.59
    lav         -0.59
    POSITIVE LOGITS
     speaking    0.97
    龍喚士       0.86
    zers         0.82
     honestly    0.74
    ometry       0.73
    speaking     0.71
     doubted     0.67
     admit       0.67
    odox         0.67
     surprised   0.66
    Activation Density 0.034%

    No Known Activations