INDEX
Explanations
phrases indicating a collection or a set of related items
instances of the word "These," indicating a focus on referring to groups or collections of items or people
Negative Logits
planning    -0.70
wound       -0.69
ticket      -0.63
reflex      -0.63
status      -0.63
finished    -0.62
tele        -0.62
manager     -0.62
board       -0.62
sacked      -0.61
Positive Logits
These       3.06
these       2.31
These       2.18
Those       1.88
THESE       1.86
Such        1.70
This        1.64
Each        1.52
They        1.51
Both        1.42
Activations Density 0.017%
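
The density figure above can be read as the share of sampled tokens on which the feature fires. The source does not say how the value is computed, so the following is only a minimal sketch under that assumption; the function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of activations strictly above `threshold`.

    Assumption: "density" means the fraction of sampled tokens on which
    the feature's activation is nonzero. This is not confirmed by the source.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    fired = sum(1 for a in activations if a > threshold)
    return 100.0 * fired / len(activations)


# Hypothetical sample: the feature fires on 17 of 100,000 tokens.
sample = [1.5] * 17 + [0.0] * 99_983
print(f"{activation_density(sample):.3f}%")  # prints 0.017%
```

Under this reading, a density of 0.017% corresponds to roughly 17 firing tokens per 100,000 sampled.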