Explanations
phrases indicating the start of a list or section in a document
references to structured formats or sections within a document
Negative Logits
Token        Logit
work         -0.63
thinking     -0.62
culture      -0.61
unit         -0.61
commitment   -0.61
doctrine     -0.61
tant         -0.60
party        -0.59
capacity     -0.59
stop         -0.58
Positive Logits
Token        Logit
Below        3.56
Below        2.41
Above        1.96
below        1.82
BELOW        1.77
Here         1.54
Above        1.47
below        1.40
above        1.33
Beyond       1.25
Activations Density: 0.011%