INDEX
    Explanations

    instances of LaTeX formatting or mathematical symbols

    Negative Logits
    Token       Logit
    " Cub"      -0.14
    "urge"      -0.14
    " Franc"    -0.14
    "eson"      -0.14
    "asje"      -0.14
    "omet"      -0.14
    " Dud"      -0.13
    "ÑĩаÑĤ"      -0.13
    " Puerto"   -0.13
    " Surge"    -0.13
    Positive Logits
    Token       Logit
    "it"        0.25
    "bf"        0.24
    "tt"        0.21
    "rm"        0.21
    "em"        0.20
    "foot"      0.19
    "sc"        0.18
    "sf"        0.18
    ".sl"       0.17
    " bf"       0.17
    Activation Density: 0.020%

    No Known Activations