Explanations
phrases related to political promises and their fulfillment or lack thereof
Negative Logits
بط          -0.15
AIT         -0.15
orz         -0.15
ebi         -0.14
Paused      -0.14
asti        -0.14
/pass       -0.14
697         -0.14
Forgery     -0.14
arga        -0.14
Positive Logits
promise     1.05
promises    0.96
Promise     0.88
promise     0.82
promised    0.78
Promise     0.77
prom        0.68
Prom        0.67
PROM        0.67
promising   0.65
Activations Density 0.310%