INDEX
Explanations
references to publications or sources within a sentence
occurrences of opening parentheses in the text
Negative Logits
prey        -0.83
incons      -0.76
inward      -0.76
hydrogen    -0.73
lag         -0.72
clin        -0.72
stem        -0.71
pale        -0.71
generation  -0.70
synt        -0.70
Positive Logits
which     1.79
pictured  1.77
formerly  1.75
although  1.62
whose     1.62
now       1.62
see       1.61
later     1.59
among     1.57
1.56
Activations Density 0.150%