INDEX
Explanations
phrases related to advocating for actions to be taken, often with a sense of urgency and importance
references to consistency or repetition of actions
Negative Logits
Leilan       -0.69
ospons       -0.66
urated       -0.65
Provides     -0.64
NULL         -0.63
andise       -0.62
osponsors    -0.62
opened       -0.61
usalem       -0.59
quished      -0.59
Positive Logits
same         1.39
unthinkable  1.27
same         1.17
opposite     1.08
simplest     1.08
oret         1.03
exact        1.01
utmost       1.01
hardest      0.96
groundwork   0.95
Activations Density: 0.075%