INDEX
    Explanations

    phrases describing various actions or events happening

    instances of existence or presence in statements

    Negative Logits
     Pants          -0.67
     TRUMP          -0.64
    ingham          -0.62
    abases          -0.61
    essen           -0.59
    asketball       -0.58
     RG             -0.57
    emate           -0.57
    equality        -0.56
    cha             -0.56
    Positive Logits
     wont           1.04
     evidenced      0.83
     attest         0.74
     often          0.71
     [|             0.69
    çͰ              0.68
    ãĥĩãĤ£          0.67
     tremend        0.67
     actionGroup    0.67
     previously     0.67
    Act Density 0.220%

    No Known Activations