INDEX
    Explanations

    ordinal indicators like "first," "second," and "third."

    NEGATIVE LOGITS
    Token      Logit
    ']").      -0.76
    '],$       -0.73
    ]');       -0.71
    "");       -0.70
    ]]         -0.69
    ']):       -0.69
    ))^{       -0.68
    ]");       -0.68
    er         -0.68
    ]").       -0.65
    POSITIVE LOGITS
    Token      Logit
    th         2.41
    TH         1.55
    ths        1.18
    teenth     1.15
     ninth     1.14
     th        1.13
     eighth    1.12
     seventh   1.09
     tenth     1.08
    Th         1.07
    Act Density: 0.058%

    No Known Activations