INDEX
    Explanations

    phrases related to casual conversation and social interactions

    conversational expressions and interjections

    Negative Logits
    sequently       -0.66
    ourses          -0.66
    Q               -0.65
    ourse           -0.64
    ilst            -0.64
    20439           -0.64
    avering         -0.63
    etermined       -0.63
    egu             -0.62
    sufficient      -0.62

    Positive Logits
     freaking        0.96
     fucking         0.94
     goddamn         0.94
     crappy          0.94
     damn            0.92
     shitty          0.92
     nerds           0.92
     kidding         0.92
     dudes           0.90
     crap            0.89

    Activation Density: 1.502%

    No Known Activations