INDEX
    Explanations

    inputs that are entirely neutral or fact-based with no emotional or opinion-based content

    Negative Logits
    Logit   Token
    -0.94    تضيفلها
    -0.88
    -0.83   twimg
    -0.81   WriteTagHelper
    -0.78   AddTagHelper
    -0.76   EDEFAULT
    -0.72   #+#
    -0.70   Gambas
    -0.70   édie
    -0.69   OrWhiteSpace
    Positive Logits
    Logit   Token
    0.58    [])
    0.57    }');
    0.56    ]<<"
    0.53    ]();
    0.50    ')";
    0.48    })();
    0.48    --){
    0.47     []),
    0.47    __':
    0.47    (".");
    Activation Density: 0.072%

    No Known Activations