INDEX
    Explanations

    phrases indicating desire or intention

    expressions of desire or intent

    Negative Logits
    " Fowler"         -0.65
    "rir"             -0.61
    " Notting"        -0.61
    "Condition"       -0.60
    " guiActiveUn"    -0.59
    "rium"            -0.58
    "ilial"           -0.58
    "roach"           -0.57
    "may"             -0.57
    "MRI"             -0.56
    Positive Logits
    " sake"           0.68
    " honesty"        0.67
    " answers"        0.67
    " cleaned"        0.64
    " clarity"        0.63
    " honest"         0.63
    " smoot"          0.63
    " revenge"        0.63
    "thood"           0.63
    " rebuilt"        0.62
    Activation Density: 0.163%

    No Known Activations