    Explanations

    instances of the word "lying" and its variations

    Negative Logits
    Token           Logit
    " McCormack"    -0.66
    " NCC"          -0.66
    " Momb"         -0.64
    " ALF"          -0.64
    "GPP"           -0.64
    " FEC"          -0.63
    " IFT"          -0.63
    " CDP"          -0.62
    "FEC"           -0.62
    " ESM"          -0.62
    Positive Logits
    Token           Logit
    " lying"        1.64
    " lie"          1.43
    " Lying"        1.41
    "Lying"         1.38
    " lies"         1.34
    "lying"         1.23
    " Lies"         1.17
    " Lie"          1.08
    "Lies"          1.05
    "Lie"           0.96
    Activation Density: 0.006%

    No Known Activations