INDEX
Explanations
words related to states of confusion or uncertainty
Negative Logits
bye             -0.80
deviations      -0.77
Policies        -0.72
Equity          -0.72
equity          -0.70
Order           -0.68
guidelines      -0.67
Guidelines      -0.67
reconciliation  -0.63
roles           -0.63
Positive Logits
ingly           1.40
ciating         1.05
ament           1.02
stru            1.00
iously          1.00
ient            0.92
ulous           0.92
ruciating       0.91
vic             0.91
amental         0.90
Activations Density 0.121%