INDEX
Explanations
references to instructions or steps in a list
occurrences of parentheses in the text
Negative Logits
Token        Logit
confront     -0.79
spir         -0.79
constitu     -0.72
prey         -0.71
clin         -0.70
grip         -0.70
gang         -0.69
detained     -0.69
confronted   -0.68
aver         -0.68
Positive Logits
Token        Logit
excluding    1.55
optional     1.50
depending    1.45
which        1.44
assuming     1.42
except       1.41
unless       1.41
usually      1.41
including    1.39
possibly     1.38
Activations Density 0.132%