INDEX
Explanations
words related to setting examples or being a precedent
phrases that discuss setting precedents or models for future actions and policies
Negative Logits
aeus       -0.81
女         -0.73
igree      -0.68
omes       -0.68
ovych      -0.67
ifact      -0.65
heart      -0.65
leep       -0.64
Includes   -0.63
ettes      -0.62
Positive Logits
future       1.55
subsequent   1.35
others       1.14
eventual     1.08
aspiring     1.05
upcoming     0.99
other        0.98
further      0.97
later        0.96
broader      0.95
Activations Density: 0.299%