    Explanations

    phrases indicating measurements or descriptors related to intensity or levels

    Negative Logits
    Token              Logit
    " middle"          -0.78
    "middle"           -0.74
    " MIDDLE"          -0.71
    "中间"             -0.69
    "Middle"           -0.69
    " intermediate"    -0.67
    "intermediate"     -0.67
    " mittleren"       -0.65
    " mid"             -0.65
    "Intermediate"     -0.64

    Positive Logits
    Token              Logit
    " late"            0.57
    " Late"            0.52
    "Late"             0.47
    " high"            0.46
    " end"             0.45
    " 고"              0.42
    " High"            0.41
    " LATE"            0.41
    " advanced"        0.41
    "late"             0.40

    Activation Density: 0.042%

    No Known Activations