INDEX
Explanations
phrases indicating a contrast or division within a group
Negative Logits
Token         Logit
Saying        -0.71
Loading       -0.70
OOL           -0.70
Recap         -0.68
disclaimer    -0.67
package       -0.65
arter         -0.64
Quote         -0.63
Correction    -0.60
opener        -0.60
Positive Logits
Token         Logit
been          1.34
arisen        1.18
hitherto      1.11
risen         1.07
been          1.06
become        1.02
existed       1.01
resided       0.99
fallen        0.99
gotten        0.97
Activations Density: 0.151%