INDEX
    Explanations

    colons in code and text

    Negative Logits
                0.51
    .'),        0.49
                0.48
    Fonte       0.48
    .•          0.47
    yd          0.47
                0.46
    .           0.46
    !”.         0.44
    […]         0.44
    Positive Logits
     You        0.65
     I          0.64
     Yes        0.63
                0.62
     This       0.61
     {          0.61
     Your       0.61
     siehe      0.60
     A          0.59
     What       0.59
    Act Density 0.072%

    No Known Activations