INDEX
    Explanations

    phrases indicating proportions or fractions

    Negative Logits
    Token       Logit
    "enko"      -0.15
    "999"       -0.15
    "riad"      -0.15
    "inic"      -0.14
    "enk"       -0.14
    "aley"      -0.14
    "arias"     -0.13
    "rof"       -0.13
    "½"         -0.13
    "ký"        -0.13
    Positive Logits
    Token       Logit
    " third"    0.81
    "third"     0.73
    " THIRD"    0.66
    "-third"    0.65
    "Third"     0.64
    " Third"    0.63
    " fifth"    0.62
    "第三"      0.61
    " thirds"   0.60
    "_third"    0.57
    Act Density 0.091%

    No Known Activations