    Explanations

    instances of the word "sorry" and variations in expressions of regret

    Negative Logits
    Token               Logit
    "ets"               -0.17
    "erne"              -0.16
    "alo"               -0.16
    "847"               -0.15
    " Alb"              -0.14
    "maya"              -0.14
    "elda"              -0.13
    "arra"              -0.13
    "ery"               -0.13
    " possibilities"    -0.13
    Positive Logits
    Token               Logit
    "apat"               0.18
    "kus"                0.17
    "æĻ´"                0.16
    "isser"              0.15
    "LATED"              0.15
    " about"             0.15
    "omba"               0.15
    "sorry"              0.15
    "ylon"               0.14
    "achel"              0.14
    Activation Density: 0.013%

    No Known Activations