INDEX
    Explanations

    words related to truth or honesty

    references to the concept of truth

    Negative Logits
        Token            Logit
        " elig"          -0.68
        " Pharm"         -0.64
        " Spice"         -0.61
        " fin"           -0.61
        " Turk"          -0.57
        " disadvant"     -0.57
        " Sto"           -0.57
        " Peninsula"     -0.57
        " Krish"         -0.57
        "uled"           -0.56

    Positive Logits
        Token            Logit
        "fulness"         1.61
        "fully"           1.28
        "iness"           1.10
        "telling"         1.06
        " serum"          1.00
        "about"           0.97
        "ful"             0.96
        "ulence"          0.94
        "seekers"         0.90
        "orial"           0.88

    Activation Density: 0.041%

    No Known Activations