INDEX
    Explanations

    expressions of disappointment

    Negative Logits
    "ittee"        -0.70
    "running"      -0.69
    "monary"       -0.69
    "skirts"       -0.68
    "xon"          -0.67
    "llular"       -0.65
    "ifa"          -0.65
    "ahu"          -0.64
    "ermanent"     -0.62
    "uto"          -0.62
    Positive Logits
    " disappoint"      0.89
    "actory"           0.81
    " disappointment"  0.80
    "ments"            0.78
    "fully"            0.74
    " omission"        0.73
    "ingly"            0.73
    " disappointed"    0.73
    "ful"              0.72
    " loser"           0.72
    Activation Density: 0.068%

    No Known Activations