    Explanations

    instances of comparisons or conditions related to numbers and quantities

    Negative Logits
     Second
    -0.20
     SECOND
    -0.17
     втор
    -0.17
     الثاني
    -0.17
    第二
    -0.17
     第二
    -0.16
    Second
    -0.16
     Secondly
    -0.16
    SECOND
    -0.16
    second
    -0.16
    Positive Logits
    3
    0.43
     third
    0.43
     thirds
    0.37
     Third
    0.36
    third
    0.35
     трет
    0.34
    -third
    0.33
    Third
    0.32
     THIRD
    0.32
    ３
    0.32
    Act Density 0.128%

    No Known Activations