INDEX
    Explanations

    words related to the legal system, experiments, politeness and being on time.

    Negative Logits
    Token             Logit
    " even"           -4.13
    "even"            -3.81
    "Even"            -3.47
    " Even"           -3.44
    " EVEN"           -3.31
    "EVEN"            -2.92
    " даже"           -2.92
    " навіть"         -2.86
    " incluso"        -2.78
    " حتی"            -2.61
    Positive Logits
    Token             Logit
    "mergeFrom"       0.57
    "mbggenerated"    0.57
    "而已"            0.57
    " distanciation"  0.55
    "styleType"       0.54
    " oprot"          0.54
    " незавершена"    0.54
    "ImageContext"    0.54
    "Билгалдахарш"    0.53
    "parsedMessage"   0.53
    Activation Density: 8.904%

    No Known Activations