INDEX
    Explanations

    statements or assertions

    instances of the phrase "the fact that."

    Negative Logits
    Token         Logit
    "aukee"       -0.77
    "Si"          -0.72
    "pec"         -0.70
    "vc"          -0.69
    "uttering"    -0.66
    "ocking"      -0.64
    "yn"          -0.64
    "ately"       -0.63
    "Eye"         -0.63
    "wn"          -0.63
    Positive Logits
    Token           Logit
    " someone"      0.96
    " they"         0.94
    " nobody"       0.87
    " there"        0.84
    " somebody"     0.84
    " hindsight"    0.83
    " we"           0.81
    " everyone"     0.79
    " anyone"       0.76
    " humans"       0.76
    Activation Density: 0.084%

    No Known Activations