INDEX
    Explanations

    phrases expressing strong emotions or reactions

    Head Attr Weights
    0:0.02
    1:0.02
    2:0.09
    3:0.04
    4:0.11
    5:0.15
    6:0.25
    7:0.02
    8:0.12
    9:0.04
    10:0.05
    11:0.04
    Negative Logits
     affinity
    -1.45
     corpus
    -1.45
     geography
    -1.43
     jurisdiction
    -1.41
    racuse
    -1.35
     offence
    -1.35
     enrol
    -1.34
    assetsadobe
    -1.33
    displayText
    -1.33
     geographic
    -1.33
    Positive Logits
    !!!!
    1.66
     Respect
    1.56
    laughs
    1.56
    !!
    1.50
    Mods
    1.47
    !!!!!
    1.46
    1.45
     Laugh
    1.43
    !!!
    1.42
    Laughs
    1.41
    Act Density 0.006%

    No Known Activations