INDEX
    Explanations

    expressive or exaggerated language related to dissatisfaction

    Head Attr Weights
    0:0.07
    1:0.07
    2:0.07
    3:0.07
    4:0.09
    5:0.07
    6:0.07
    7:0.09
    8:0.09
    9:0.08
    10:0.09
    11:0.08
    Negative Logits
    "apply"     -1.88
    " Located"  -1.79
    "xtap"      -1.78
    "ographs"   -1.68
    "request"   -1.63
    "olin"      -1.62
    "ombs"      -1.60
    "ogens"     -1.59
    "mercial"   -1.59
    "ograph"    -1.59
    Positive Logits
    " Cyr"        1.70
    " Nile"       1.54
    " Ferrari"    1.50
    " polyg"      1.49
    " meanwhile"  1.49
    " dearly"     1.48
    " Tsukuyomi"  1.48
    " Fiat"       1.47
    " electrom"   1.46
    " regress"    1.46
    Act Density 0.000%

    No Known Activations