Explanations
warnings or prohibitive language in text
Negative Logits
Token           Logit
"):             -0.77
__":            -0.74
itemBuilder     -0.74
WebServlet      -0.72
"],             -0.71
"),             -0.71
"){             -0.70
")){            -0.69
"""             -0.69
zarchiwizowane  -0.68
Positive Logits
Token           Logit
!               0.76
!!!             0.65
!!              0.60
!!!!            0.57
!.              0.57
;               0.56
ー              0.55
Билгалдахарш    0.55
poveznice       0.54
!!!!!           0.53
Activations Density 0.895%