INDEX
    Explanations

    instances of formatted mathematical expressions

    Negative Logits
    Token                 Logit
    " Bedür"              -0.64
    " Italijanski"        -0.57
    "opportunity"         -0.56
    "whelming"            -0.56
    " disambiguazione"    -0.56
    " Photocase"          -0.55
    "encounter"           -0.54
    " ویکی‌پدی"            -0.53
    " Krueger"            -0.53
    "Tikang"              -0.52
    Positive Logits
    Token                 Logit
    "text"                1.17
    " text"               0.85
    "Text"                0.82
    " Text"               0.67
    "TEXT"                0.66
    "textbf"              0.64
    "mbox"                0.63
    "textit"              0.58
    "textrm"              0.57
    "\{\\"                0.57
    Activation Density: 0.130%

    No Known Activations