INDEX
    Explanations

    instances where the word "since" is used in sentences

    phrases indicating a causal relationship or temporal markers in a discussion

    Negative Logits
    Token               Logit
    " pione"            -0.84
    " exting"           -0.79
    "vantage"           -0.77
    "ilan"              -0.77
    "displayText"       -0.77
    "pec"               -0.74
    "ā"                 -0.74
    " RandomRedditor"   -0.73
    "Ď"                 -0.73
    "û"                 -0.73
    Positive Logits
    Token               Logit
    " they"             1.09
    " many"             1.06
    " there"            1.04
    " most"             1.03
    " neither"          1.02
    " nobody"           1.01
    " it"               0.97
    " we"               0.92
    " none"             0.89
    "rely"              0.85
    Activation Density: 0.180%

    No Known Activations