    Explanations

    expressions of willingness or readiness

    expressions of positive emotions or sentiments

    Negative Logits
    thumbnails      -0.85
     Killer         -0.68
     vocabulary     -0.66
     GOODMAN        -0.65
     causation      -0.64
     verbs          -0.64
     restraint      -0.64
     causal         -0.64
    onal            -0.63
     specifics      -0.63
    Positive Logits
    ãĤ©             0.91
     welcomed       0.90
     embraced       0.88
     accepted       0.83
     endorse        0.83
     greeted        0.81
     reunited       0.79
     endorsed       0.79
     entertained    0.77
     honoured       0.76
    Activation Density: 0.071%

    No Known Activations