INDEX
    Explanations

    expressions of confusion or inquiry regarding various topics

    Negative Logits
    Token         Logit
    " neither"    -0.19
    " doesn"      -0.17
    " didn"       -0.16
    "éra"         -0.14
    " never"      -0.13
    " only"       -0.13
    " hasn"       -0.13
    "ileÅŁ"       -0.13
    "uffling"     -0.13
    "undan"       -0.13
    Positive Logits
    Token         Logit
    " Not"        1.03
    "Not"         0.96
    "not"         0.85
    "-not"        0.81
    "_not"        0.81
    " not"        0.79
    " notch"      0.78
    " NOT"        0.73
    ".not"        0.71
    "_Not"        0.70
    Activation Density: 0.316%

    No Known Activations