INDEX
    Explanations

    numerical values in a specific range within the text

    phrases indicating ordinal positions or instances within a sequential list

    NEGATIVE LOGITS
    Token          Logit
    "WER"          -0.79
    "ertodd"       -0.69
    "helle"        -0.64
    "irection"     -0.64
    " surpr"       -0.63
    " condem"      -0.60
    "heit"         -0.59
    "overe"        -0.57
    "cape"         -0.57
    " disposed"    -0.56
    POSITIVE LOGITS
    Token          Logit
    "arching"      0.62
    "icial"        0.60
    "frames"       0.60
    "tan"          0.60
    "third"        0.57
    "their"        0.57
    " course"      0.56
    "major"        0.56
    "fourth"       0.56
    " thirds"      0.55
    Activation Density: 0.103%

    No Known Activations