INDEX
    Explanations

    phrases that provide reasons or explanations

    instances of the word "Because" indicating explanations or justifications

    Negative Logits
    "shaw"        -0.78
    "jet"         -0.76
    "ns"          -0.74
    "wn"          -0.74
    "mint"        -0.73
    "yan"         -0.71
    "åĤ"          -0.71
    "robe"        -0.71
    "shr"         -0.68
    "VR"          -0.67

    Positive Logits
    " fuck"        0.70
    " beware"      0.69
    " they"        0.66
    "*/("          0.65
    "ecause"       0.65
    " Prosper"     0.64
    " there"       0.64
    "âĶģ"          0.64
    "olini"        0.63
    "elligence"    0.61

    Activation Density: 0.049%

    No Known Activations