INDEX
Explanations
phrases related to establishing and influencing the direction or mood
phrases related to setting standards or precedents
Negative Logits
leness      -0.70
ividual     -0.69
¬¼          -0.67
outweigh    -0.67
vernment    -0.62
ugg         -0.60
outwe       -0.59
derive      -0.58
jug         -0.58
enza        -0.58
Positive Logits
precedent   1.17
tle         1.05
tone        1.02
sights      1.00
flame       0.97
precedence  0.97
preced      0.97
abl         0.92
groundwork  0.90
pace        0.86
Activations Density: 0.062%