INDEX
    Explanations

    names or pronouns referring to specific individuals

    references to specific individuals and their actions or statements

    Negative Logits
    ĸļ              -0.82
    animate         -0.64
     murderer       -0.64
    Load            -0.62
    UTE             -0.62
     lifes          -0.61
    «ĺ              -0.60
     transformative -0.60
     Nirvana        -0.60
    ¾               -0.59
    Positive Logits
     intends        1.65
     expects        1.63
     hopes          1.55
     wants          1.49
     anticip        1.46
     insists        1.46
     believes       1.45
     prefers        1.43
     vows           1.37
     proposes       1.30
    Act Density 0.426%

    No Known Activations