INDEX
    Explanations

    expressions of disappointment

    Negative Logits
    "ilic"           -0.72
    "ilian"          -0.72
    "ittee"          -0.71
    "alach"          -0.70
    "hens"           -0.70
    "skirts"         -0.70
    "running"        -0.69
    "ossession"      -0.67
    "ioch"           -0.67
    " livest"        -0.64
    Positive Logits
    "actory"           0.86
    " disappoint"      0.84
    " disappointment"  0.83
    "imaru"            0.80
    " disappointed"    0.75
    " loser"           0.72
    "ments"            0.71
    "ingly"            0.67
    " losers"          0.66
    "fully"            0.63
    Activation Density: 0.047%

    No Known Activations