    NEGATIVE LOGITS
    Token             Logit
    " initiating"     -0.77
    " 仲"             -0.73
    "альные"          -0.73
    "onents"          -0.73
    "Critique"        -0.71
    " during"         -0.70
    (not shown)       -0.69
    "メニューは"      -0.68
    "atex"            -0.68
    " since"          -0.68
    POSITIVE LOGITS
    Token             Logit
    " flags"          1.25
    " condition"      1.20
    "Flags"           1.19
    " Carry"          1.17
    " Zero"           1.16
    " flag"           1.16
    "Zero"            1.16
    "Carry"           1.15
    " carry"          1.15
    "Condition"       1.13
    Activation Density 0.022%

    No Known Activations