INDEX
Explanations
inputs that are entirely neutral or fact-based with no emotional or opinion-based content
NEGATIVE LOGITS
تضيفلها
-0.94
ⓧ
-0.88
twimg
-0.83
WriteTagHelper
-0.81
AddTagHelper
-0.78
EDEFAULT
-0.76
#+#
-0.72
Gambas
-0.70
édie
-0.70
OrWhiteSpace
-0.69
POSITIVE LOGITS
[])
0.58
}');
0.57
]<<"
0.56
]();
0.53
')";
0.50
})();
0.48
--){
0.48
[]),
0.47
__':
0.47
(".");
0.47
Activations Density 0.072%