INDEX
    Explanations

    mentions of significant or high-impact events

    Negative Logits
    Token         Logit
    " which"      -0.28
    ","           -0.25
    " otherwise"  -0.25
    " or"         -0.22
    "which"       -0.22
    "otherwise"   -0.20
    " thereby"    -0.20
    " and"        -0.20
    " thus"       -0.20
    " but"        -0.19

    Positive Logits
    Token         Logit
    " there"      0.28
    " it"         0.22
    "there"       0.22
    " if"         0.21
    " they"       0.20
    "çͱäºİ"       0.20
    " we"         0.20
    "if"          0.20
    "they"        0.19
    "we"          0.19

    Act Density 0.476%

    No Known Activations