INDEX
Explanations
phrases indicating authority and responsibilities
Negative Logits
dden     -0.17
argo     -0.16
å¿į      -0.14
Curse    -0.14
deo      -0.14
roz      -0.14
kre      -0.13
mit      -0.13
ạc       -0.13
pent     -0.13
Positive Logits
freedom          0.25
discretion       0.24
responsibility   0.23
latitude         0.22
option           0.22
flexibility      0.22
options          0.21
Freedom          0.21
authority        0.20
liberty          0.20
Activations Density: 0.113%