    Explanations

    the presence of symbols or characters related to programming syntax

    Negative Logits
    Token           Logit
    ')")            -0.63
    Personensuche   -0.62
    abestanden      -0.60
    '");            -0.58
    !")             -0.55
    ']")            -0.55
    ."));           -0.53
    `).             -0.51
    \"");           -0.51
    ’).             -0.50
    Positive Logits
    Token           Logit
    =               1.49
    =”              1.01
    ="              0.99
     =              0.91
    ='              0.87
    =_              0.87
    =+              0.85
    =-              0.83
    =\              0.82
    =/              0.80
    Activation Density: 0.043%

    No Known Activations