INDEX
    Explanations

    phrases indicating potential actions, recommendations, or considerations

    instances of the phrase "we" and its variants indicating collective thoughts or actions

    Negative Logits
    "amaz"          -0.62
    "Rank"          -0.60
    "dylib"         -0.58
    "bats"          -0.58
    "rams"          -0.56
    "shows"         -0.54
    " Cheong"       -0.54
    " satell"       -0.54
    " WTC"          -0.54
    " Ups"          -0.53
    Positive Logits
    " sorely"       1.13
    " dearly"       0.90
    " aspire"       0.87
    " gladly"       0.86
    " dreamed"      0.85
    " envy"         0.81
    " wont"         0.79
    " desperately"  0.79
    " vehemently"   0.78
    " hotly"        0.78
    Act Density 0.155%

    No Known Activations