Explanations
phrases indicating victories or successes related to policy and legal outcomes
Negative Logits
oppable      -0.14
iffs         -0.14
bal          -0.14
ÙħÙħ (مم)    -0.13
vur          -0.13
Beaver       -0.13
ãĤĥ (ゃ)     -0.13
_ONCE        -0.13
fcn          -0.13
.opts        -0.13
Positive Logits
victory          0.46
victories        0.39
Victory          0.35
triumph          0.34
win              0.33
achievement      0.32
accomplishment   0.30
wins             0.29
successes        0.27
èĥľ (胜)         0.26
Activations Density 0.184%
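
Some tokens in the logit lists, such as `èĥľ`, `ÙħÙħ`, and `ãĤĥ`, look garbled because they appear to be shown in a byte-level BPE alphabet rather than as decoded text. The sketch below assumes the tokenizer uses the GPT-2-style byte-to-character mapping, which this page does not state; the helper names `byte_to_char_table` and `decode_token` are hypothetical and used only for illustration.

```python
def byte_to_char_table():
    """Build the GPT-2-style mapping from each byte value to a printable character.

    Assumption: the tokenizer behind this page uses this mapping.
    """
    # Bytes that are already printable map to themselves.
    printable = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(0xA1, 0xAC + 1))
        + list(range(0xAE, 0xFF + 1))
    )
    byte_values = printable[:]
    code_points = printable[:]
    shift = 0
    # Every remaining byte is assigned a code point starting at 256.
    for b in range(256):
        if b not in printable:
            byte_values.append(b)
            code_points.append(256 + shift)
            shift += 1
    return {b: chr(c) for b, c in zip(byte_values, code_points)}


# Inverse table: displayed character -> original byte value.
CHAR_TO_BYTE = {ch: b for b, ch in byte_to_char_table().items()}


def decode_token(token):
    """Turn a byte-level token string back into readable text.

    Returns the token unchanged if it contains a character outside the table.
    """
    try:
        raw = bytes(CHAR_TO_BYTE[ch] for ch in token)
    except KeyError:
        return token
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    for tok in ["èĥľ", "ÙħÙħ", "ãĤĥ", "victory"]:
        print(repr(tok), "->", repr(decode_token(tok)))
```

Under that assumption, `èĥľ` decodes to the Chinese character 胜 ("victory"), which matches the other positive-logit tokens, while plain ASCII tokens such as `victory` pass through unchanged.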