INDEX
    Explanations

    Phrases related to discussions or arguments in which strong opinions are expressed, especially in the context of sports or conflicts

    Negative Logits
     preval
    -0.73
     transact
    -0.71
     unsus
    -0.71
     concess
    -0.70
     unlucky
    -0.69
     abundantly
    -0.69
     intentional
    -0.67
     clerks
    -0.67
     silly
    -0.67
     naughty
    -0.67
    Positive Logits
     "…
    1.13
     "...
    1.12
     "(
    1.07
     Asked
    1.04
     "[
    1.03
     "'
    1.01
     Adds
    0.98
     However
    0.93
    <|endoftext|>
    0.91
    0.90
    Act Density 0.186%

    No Known Activations