    Explanations

    words related to communication and dialogue

    phrases related to dialogue or speech

    Negative Logits
    " dudes"       -0.66
    " shitty"      -0.65
    " shit"        -0.62
    " wanna"       -0.62
    " crap"        -0.62
    " swat"        -0.62
    " bashing"     -0.62
    " trolling"    -0.62
    " dude"        -0.61
    " titan"       -0.58
    Positive Logits
    "ogether"       1.00
    "sequently"     0.95
    "Therefore"     0.94
    " Finally"      0.83
    "Furthermore"   0.83
    " Lastly"       0.82
    "Finally"       0.82
    " furthermore"  0.78
    "Moreover"      0.78
    " Afterwards"   0.77
    Activation Density 1.082%

    No Known Activations