INDEX
    Explanations

    words indicating numerical data or performance statistics

    Negative Logits
    Token               Logit
    ".scalablytyped"    -0.16
    "_Lean"             -0.16
    "jr"                -0.15
    "_Tis"              -0.15
    "dera"              -0.14
    "erais"             -0.14
    "çIJ³"              -0.14
    "/Framework"        -0.14
    "enso"              -0.14
    "ANJI"              -0.14
    Positive Logits
    Token           Logit
    " Multiple"     0.16
    " multiple"     0.16
    " else"         0.15
    " Sur"          0.15
    "Multiple"      0.15
    " Else"         0.15
    "ddit"          0.15
    "ød"            0.14
    "Else"          0.14
    " elsewhere"    0.14
    Act Density 0.002%

    No Known Activations