INDEX
Explanations
mentions of "straw" and related terms
the presence of separators or markers, particularly related to "straw."
Negative Logits
ervation    -0.76
olon        -0.74
apse        -0.72
ogue        -0.71
notor       -0.70
ynt         -0.69
ccording    -0.68
olitan      -0.68
ECD         -0.67
itals       -0.65
Positive Logits
straw       1.10
backs       0.95
weight      0.93
weights     0.88
sided       0.88
mere        0.86
stad        0.85
leaf        0.84
pipe        0.84
bridge      0.82
Activations Density 0.035%
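
The density figure above can be read as a fraction of token positions on which the feature fires. The page does not say how it was computed, so the following is only a minimal sketch under that assumption: the function name `activation_density`, the zero threshold, and the sample data are all hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumption: "density" means (number of active tokens) / (total tokens).
    This is an illustrative definition, not one confirmed by the page.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 20,000 token positions, 7 of which activate the feature.
acts = [0.0] * 20000
for i in range(7):
    acts[i * 1000] = 1.5

density = activation_density(acts)
print(f"{density:.3%}")  # 7 / 20000, printed as a percentage
```

Under this reading, a density of 0.035% would correspond to roughly 3 or 4 active tokens per 10,000, which fits a feature that fires mainly on a single rare word such as "straw".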