INDEX
    Explanations

    punctuation marks indicating pauses or separations in thought

    special characters and punctuation

    Negative Logits
    تقاوى
    -0.64
    InjectAttribute
    -0.63
    '];?>
    -0.61
    ."]
    -0.60
    ...");
    -0.60
     unſ
    -0.60
     :");
    -0.59
     ."
    -0.59
    ']?>
    -0.58
    ...")
    -0.57
    Positive Logits
    ,
    0.81
    ,<
    0.62
    %,
    0.60
    0.59
    $,
    0.58
    ،
    0.57
    #,
    0.55
     \%,
    0.54
    \%,
    0.54
     %,
    0.52
    Act Density 2.748%

    No Known Activations