INDEX
Explanations
actions reflecting progress and improvements in various contexts
Negative Logits
able    -0.15
Holt    -0.15
147     -0.15
still   -0.15
ing     -0.14
au      -0.14
soon    -0.14
ig      -0.14
Å       -0.14
rah     -0.14
Positive Logits
lately      0.33
since       0.27
recently    0.27
Recently    0.22
since       0.22
Recently    0.22
_since      0.21
ness        0.20
recent      0.19
/shared     0.18
Activations Density 0.399%