    Explanations

    code-related components and structures, particularly those associated with models and APIs

    Negative Logits
    oneofs            -0.84
    windowFixed       -0.83
     تضيفلها          -0.80
    featureID         -0.79
    issory            -0.78
    SharedDtor        -0.78
    WriteTagHelper    -0.77
     CreateTagHelper  -0.76
     typelib          -0.75
     Paglinawan       -0.75
    Positive Logits
    ές                0.49
    ,:),              0.47
     Marston          0.43
    ','',             0.42
    [toxicity=0]      0.42
    ↵↵↵               0.42
    yip               0.42
    WithString        0.42
    door              0.42
    やる                0.42
    Activation Density: 0.080%

    No Known Activations