    Explanations

    punctuation marks, particularly commas and periods

    Negative Logits
    Token               Logit
    }";                 -0.64
    }');                -0.60
    '");                -0.59
    ')";                -0.59
     ');                -0.58
     ?";                -0.58
    ]";                 -0.58
    ]');                -0.57
     ";                 -0.56
    /');                -0.56
    Positive Logits
    Token               Logit
    ^(@)                0.73
    ėk                  0.72
    KommentareTeilen    0.72
    ropractic           0.69
    :✨                  0.68
     kasarigan          0.68
    (blank)             0.67
    SuppressLint        0.67
    TintMode            0.66
    istory              0.66
    Activation Density 0.144%

    No Known Activations