INDEX
Explanations
instances of negation or lack of evidence in statements
Negative Logits
ſeveral          -0.85
tambi            -0.73
ſmall            -0.69
alſo             -0.68
autorytatywna    -0.68
zamiast          -0.67
сього            -0.67
ſome             -0.67
beſt             -0.66
ſtill            -0.66
Positive Logits
any              2.42
anything         2.35
anymore          2.12
anything         2.04
nor              1.96
anywhere         1.91
any              1.83
Any              1.79
任何             1.79
Any              1.78
Activations Density 3.057%
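
The dashboard does not define the density figure. A minimal sketch, assuming it is the fraction of sampled token positions on which this feature has a nonzero activation, is given below. The function name `activation_density`, the `threshold` parameter, and the toy activations are hypothetical and are not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: "density" means the share of tokens on which the feature fires.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical toy activations for eight token positions (not real data).
toy = [0.0, 0.0, 1.3, 0.0, 0.7, 0.0, 0.0, 0.0]
density = activation_density(toy)
print(f"{density:.3%}")  # prints 25.000%
```

Under this reading, a density of 3.057% would mean the feature fires on roughly 3 of every 100 sampled tokens.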