INDEX
    Explanations

    phrases related to updates or announcements

    the presence of the word "We" and its variations, indicating a focus on collective or inclusive statements

    Negative Logits
    Token                 Logit
    forms                 -0.59
     guiActiveUnfocused   -0.59
    panic                 -0.58
    REDACTED              -0.58
     eviction             -0.58
     flows                -0.57
    Reply                 -0.56
     Leilan               -0.56
    ▬                     -0.55
     Flavoring            -0.55
    Positive Logits
    Token       Logit
    're         1.17
    eks         1.06
    athered     1.05
    've         1.03
    asel        1.02
    igh         1.02
    akening     1.00
    ibo         0.99
    arers       0.99
    'll         0.96
    Activation Density: 0.138%

    No Known Activations