INDEX
    Explanations

    terms related to falsehoods or deceiving statements

    occurrences of the word "lies."

    Negative Logits
     zoom      -0.66
     Chapel    -0.63
     crisp     -0.61
     quint     -0.61
     better    -0.58
     curb      -0.58
     speed     -0.57
     dash      -0.56
     tattoo    -0.56
     Singer    -0.56
    Positive Logits
    lies      5.06
    lie       2.15
    lied      1.86
    lying     1.84
    liest     1.59
    pins      1.51
    lier      1.43
    liness    1.38
    mares     1.30
    li        1.26
    Act Density 0.008%

    No Known Activations