    Explanations

    programming-related syntax and structure elements

    Negative Logits
    TestingModule      -0.70
    #+#                -0.63
     COUNTER           -0.59
    trag               -0.54
     Accepted          -0.53
     Accept            -0.53
    iscus              -0.52
     defaultstate      -0.52
    imdi               -0.52
    currentColor       -0.51
    Positive Logits
     صوتيه              0.79
    GOTREF              0.65
    Hochspringen        0.64
     typelib            0.63
     تضيفلها            0.58
     betweenstory       0.57
                        0.56
    [toxicity=0]        0.55
     nonUne             0.54
    AddTagHelper        0.54
    Activation Density: 0.618%

    No Known Activations