INDEX
    Explanations

    phrases indicating disagreement or debate

    instances of the word "argue" and its variations

    Negative Logits
     Seym           -0.67
    beam            -0.67
    cy              -0.66
    forms           -0.64
    finger          -0.63
    fing            -0.61
    psy             -0.61
    Adds            -0.60
    photos          -0.59
    gallery         -0.58

    Positive Logits
     against        0.99
     persu          0.92
    ative           0.90
     vehemently     0.90
     convinc        0.89
    atively         0.88
     forcefully     0.82
     passionately   0.82
    arians          0.78
     loudly         0.77

    Activation Density: 0.035%

    No Known Activations