INDEX
    Explanations

    variations of punctuation and formatting in text

    Negative Logits
    bootstrapcdn
    -0.77
     nawr
    -0.59
     nephe
    -0.58
     المعيارى
    -0.57
     nevertheless
    -0.56
     itſelf
    -0.55
     счита
    -0.54
     philosop
    -0.53
     oprot
    -0.53
    يكب
    -0.53
    Positive Logits
    <bos>
    0.90
    ')}}">
    0.87
    __(/*!
    0.76
    ]").
    0.74
    "]);
    
    0.74
    "]));
    0.72
    ]]:
    0.72
    //
    0.72
    })$}
    0.71
    0.71
    Act Density 0.048%

    No Known Activations