INDEX
    Explanations

    expressions of honesty or frankness

    Negative Logits
    Token            Logit
    "stination"      -0.50
    "landais"        -0.46
    "جمعیت"          -0.44
    " arşivlendi"    -0.44
    " wireType"      -0.44
    " ligiloj"       -0.43
    "imc"            -0.43
    " isolado"       -0.43
    " VIDEOT"        -0.43
    "PyExc"          -0.43
    Positive Logits
    Token            Logit
    " frankly"       0.77
    "Honestly"       0.76
    "Frankly"        0.74
    "honestly"       0.70
    " Honestly"      0.70
    " honestly"      0.69
    "Tbh"            0.59
    " tbh"           0.58
    "正直"            0.56
    " honest"        0.54
    Activation Density: 0.009%

    No Known Activations
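
The logit lists above rank vocabulary tokens by how strongly the feature pushes their output logits up or down. The page does not say how these values were computed; a common approach, assumed here, is to take the dot product of the feature's decoder direction with each token's unembedding column and sort the results. The sketch below illustrates that idea in plain Python. The function names, the toy 2-dimensional vectors, and the five-token vocabulary are all hypothetical and are not taken from the dashboard.

```python
def logit_effects(decoder_dir, unembed):
    """Dot the feature direction with each token's unembedding column.

    decoder_dir: list of d_model floats (assumed feature decoder direction).
    unembed: d_model rows, each a list of vocab_size floats (assumed layout).
    """
    vocab_size = len(unembed[0])
    return [
        sum(d * row[j] for d, row in zip(decoder_dir, unembed))
        for j in range(vocab_size)
    ]


def top_logit_tokens(decoder_dir, unembed, vocab, k=10):
    """Return the k most negative and k most positive (token, effect) pairs."""
    effects = logit_effects(decoder_dir, unembed)
    ranked = sorted(zip(vocab, effects), key=lambda pair: pair[1])  # ascending
    negative = ranked[:k]        # most negative first
    positive = ranked[::-1][:k]  # most positive first
    return negative, positive


# Toy data (hypothetical): leading spaces are part of the token strings.
vocab = [" frankly", "Honestly", "PyExc", " wireType", " the"]
unembed = [
    [0.8, 0.7, -0.4, -0.5, 0.0],
    [0.0, 0.1, 0.2, 0.1, 0.3],
]
decoder_dir = [1.0, 0.0]

negative, positive = top_logit_tokens(decoder_dir, unembed, vocab, k=2)
```

Tokens are kept as exact strings because a leading space marks a different vocabulary entry: " frankly" and "Frankly" appear as separate rows in the lists above.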