INDEX
    Explanations

    phrases related to establishing norms or examples

    phrases relating to setting standards, examples, or precedents

    Negative Logits
    "ividual"      -0.69
    "ugg"          -0.66
    "orge"         -0.66
    "leness"       -0.65
    "jug"          -0.63
    " Sunder"      -0.62
    "jj"           -0.59
    " compr"       -0.59
    " outweigh"    -0.59
    "ér"           -0.59
    Positive Logits
    " precedent"     1.28
    " tone"          1.11
    " precedence"    1.03
    "tle"            1.00
    " benchmark"     0.99
    "flame"          0.98
    " example"       0.94
    " benchmarks"    0.94
    " preced"        0.93
    " record"        0.93
    Activation Density: 0.061%

    No Known Activations