    Explanations

    occurrences of code expressions related to conditional checks

    Negative Logits
    Token             Logit
    " and"            -0.60
    "2"               -0.53
    "[toxicity=0]"    -0.52
    "5"               -0.51
    "↵↵"              -0.50
    "C"               -0.50
    "and"             -0.50
    "3"               -0.50
    " x"              -0.50
    " values"         -0.49

    Positive Logits
    Token                 Logit
    " (!"                 1.25
    "AccessorTable"       1.20
    "(!__"                1.18
    "InjectAttribute"     1.11
    "(!"                  1.07
    "]--;"                1.07
    " AssemblyCulture"    1.02
    "=!"                  1.00
    "CloseOperation"      1.00
    "+#+#"                1.00

    Act Density 0.024%

    No Known Activations