Explanations
occurrences of the character 'u'
Negative Logits
Token                 Logit
oa̍t                   -1.14
autorytatywna         -1.13
expandindo            -1.13
Paglinawan            -1.13
>=",                  -1.10
webElementXpaths      -1.10
contentLoaded         -1.02
setVerticalGroup      -1.01
مرئيه                 -1.00
Vidite                -0.99
Positive Logits
Token                 Logit
u                     1.50
U                     0.69
u                     0.63
у                     0.57
uD                    0.50
                      0.49
U                     0.49
&#                    0.48
[toxicity=0]          0.48
-                     0.46
Activations Density: 0.087%