    Explanations

    adverbs expressing sincerity or truthfulness

    expressions of honesty and frankness

    Negative Logits
    " Landing"     -0.69
    "arthy"        -0.67
    " Vaj"         -0.65
    "lav"          -0.63
    "href"         -0.62
    "etting"       -0.62
    " Klu"         -0.61
    " Blades"      -0.60
    "activated"    -0.60
    "indal"        -0.59
    Positive Logits
    " speaking"    0.89
    "zers"         0.84
    "é¾įåĸļ士"      0.70
    "speaking"     0.69
    " honestly"    0.67
    ",,,,"         0.66
    "cohol"        0.66
    "âĸijâĸij"       0.66
    "onom"         0.65
    "bear"         0.63
    Activation Density: 0.028%

    No Known Activations