INDEX
    Explanations

    Tokens that are often empty or denote spacing in text

    Code, math, or technical expressions

    Negative Logits
    Token                 Logit
    " Baillargeon"        -0.84
    " للمعارف"            -0.78
    " pleaſure"           -0.76
    " iſt"                -0.73
    " inflater"           -0.72
    " myſelf"             -0.72
    "addContainerGap"     -0.71
    " leaſt"              -0.70
    "horabuena"           -0.69
    " ecosport"           -0.68
    Positive Logits
    Token                 Logit
    "featureID"           0.61
    (blank)               0.56
    "لينكات"              0.55
    "↵↵"                  0.49
    "ResumeLayout"        0.46
    (blank)               0.46
    "unggu"               0.45
    " /\."                0.44
    " <<<<<<<<<<<<<<"     0.44
    "enoch"               0.43
    Activation Density: 0.136%

    No Known Activations