INDEX
    Explanations

    references to various medical and legal terminologies and entities

    Text following a colon

    Negative Logits
    .         -0.67
    .↵        -0.65
    .\\       -0.64
    ++.       -0.61
    ".        -0.58
    人了      -0.57
    }}$.      -0.57
    $.        -0.56
    }$.       -0.55
    %.        -0.55
    Positive Logits
    리는      0.83
    들은      0.79
    ният      0.78
     is       0.77
     has      0.72
     inilah   0.72
     inoltre  0.71
    之所以    0.71
     refers   0.69
    에는      0.68
    Act Density 1.755%

    No Known Activations