INDEX
Explanations
negative sentiment or expressions indicating decline or loss
numbers followed by units or quantities
Negative Logits
zelve              -0.61
שוליים             -0.61
Infór              -0.59
desmotivaciones    -0.58
Gerechtigkeit      -0.58
printStackTrace    -0.56
InputDecoration    -0.56
GoogleFonts        -0.54
éstos              -0.54
pérd               -0.53
Positive Logits
-      0.97
–      0.68
-      0.67
%-     0.64
()-    0.62
]-     0.61
-<     0.60
‐      0.60
)-     0.60
‑      0.59
Activations Density: 0.032%