INDEX
    Explanations

    references to the word "Lie" or variations of it

    occurrences of the word "lie."

    Negative Logits
    Token        Logit
    "arthy"      -0.77
    " oun"       -0.73
    "smart"      -0.72
    "iaries"     -0.71
    "irlf"       -0.70
    "atform"     -0.67
    "uploads"    -0.66
    "iles"       -0.66
    "detail"     -0.66
    "icable"     -0.65
    Positive Logits
    Token        Logit
    "utenant"    1.47
    " Lie"       1.00
    "ge"         0.94
    "uten"       0.91
    "berman"     0.88
    "ÃŁ"         0.86
    " detector"  0.84
    "pard"       0.83
    "Lie"        0.83
    "yer"        0.81
    Act Density 0.023%
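
To make the density figure concrete, here is a minimal sketch of how such a number could be computed. It assumes "Act Density" means the fraction of token positions on which the feature's activation is above zero; the page does not state the definition, so treat that as an assumption. The function name `activation_density` and the sample data are hypothetical and exist only for illustration.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumption: density = (number of active positions) / (total positions).
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 23 active positions out of 100,000 tokens.
sample = [1.5] * 23 + [0.0] * 99_977
density = activation_density(sample)
print(f"{density:.3%}")
```

Under that assumed definition, a density of 0.023% corresponds to roughly 23 active positions per 100,000 tokens.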

    No Known Activations