Explanations
the presence of specific non-alphanumeric tokens or formatting elements in the text
Negative Logits
Token       Logit
e           -0.85
ed          -0.82
Schroeder   -0.81
ORE         -0.81
Tind        -0.80
помним      -0.78
attro       -0.75
getvalue    -0.75
Bess        -0.73
Mendes      -0.73
Positive Logits
Token       Logit
()))        1.60
']))        1.59
]))         1.50
())         1.41
))          1.39
'))         1.39
])          1.38
++)         1.38
)))         1.38
__':        1.37
Activations Density 0.033%