INDEX
    Explanations

    instances where the phrase "to be fair" is used in the text

    phrases that emphasize fairness and clarity in discussion

    Negative Logits
    Token          Logit
    " surf"        -0.71
    "ende"         -0.69
    " hemor"       -0.69
    "bern"         -0.68
    " decom"       -0.65
    " dissolve"    -0.62
    "arnaev"       -0.61
    " med"         -0.61
    "jar"          -0.61
    "edin"         -0.60

    Positive Logits
    Token          Logit
    "oops"         0.77
    " Philippe"    0.71
    "entious"      0.71
    "uracy"        0.70
    "ayson"        0.63
    "ohn"          0.63
    "raper"        0.63
    "sonian"       0.62
    " Obj"         0.62
    "ensional"     0.61

    Act Density 0.031%

    No Known Activations