INDEX
    Explanations
    Negative Logits
    IE              -0.08
                    -0.07
    ////            -0.07
    OH              -0.07
     &              -0.07
     assh           -0.07
    Private         -0.07
                    -0.07
     Dawson         -0.07
     magnificent    -0.07
    Positive Logits
     altså          0.11
    。↵↵↵            0.09
     alltså         0.09
    —and            0.09
    —with           0.08
     unmistak       0.08
    ;↵↵↵//          0.08
     ©              0.08
    .↵↵↵↵↵↵         0.08
     whichever      0.08
    Act Density 0.097%

    No Known Activations