INDEX
Explanations
words related to policies, regulations, or laws
key terms related to regulations, policies, and governance
Negative Logits
tis       -0.72
laughs    -0.71
else      -0.62
likes     -0.60
Quit      -0.56
wise      -0.56
Else      -0.54
needs     -0.54
thinks    -0.54
Recomm    -0.54
Positive Logits
include      1.31
were         1.30
are          1.29
contain      1.26
represent    1.22
aren         1.21
comprise     1.20
involve      1.20
weren        1.20
consist      1.18
Activations Density: 0.304%