    Explanations

    phrases related to pride and self-promotion

    Negative Logits
    sene        -0.52
    lear        -0.50
     Kugel      -0.48
    olge        -0.47
     Energies   -0.46
     dise       -0.46
    nex         -0.46
    σουμε       -0.46
     acute      -0.45
    ."\         -0.45
    Positive Logits
     boast      1.43
     boasted    1.34
     bragging   1.32
     boasting   1.29
     brag       1.15
     boasts     1.15
     proud      1.09
    proud       1.09
     proudly    1.03
     stolz      1.03
    Activation Density: 0.317%

    No Known Activations