INDEX
Explanations
economic and policy-related phrases, emphasizing the concept of achieving specific goals through action
Negative Logits
Token         Logit
estate        -0.78
here          -0.75
'/            -0.70
sburg         -0.68
allery        -0.66
reality       -0.65
Aren          -0.65
wagon         -0.65
nesday        -0.64
pse           -0.64
Positive Logits
Token         Logit
caveat        1.19
caveats       1.06
exception     1.01
intention     1.01
twist         0.99
backing       0.94
bang          0.90
flick         0.89
accompanying  0.88
hindsight     0.87
Activations Density: 2.908%