    Explanations

    comparisons or contrasts in a text

    Negative Logits
    )",       -0.75
    :[        -0.69
    Scroll    -0.69
    ilater    -0.69
    :-        -0.68
    :#        -0.68
    rador     -0.65
    :]        -0.65
    :,        -0.63
    ":["      -0.61
    Positive Logits
     goddamn  0.86
    enegger   0.69
     equals   0.66
     fucking  0.66
    BILITY    0.65
     damn     0.62
     âī¡      0.62
     sucks    0.60
     suck     0.60
     godd     0.60
    Activation Density: 0.521%

    No Known Activations