INDEX
Explanations
phrases related to policies, actions, or decisions
phrases that use the word "has" to denote possession or a state of being
Negative Logits
eem          -0.67
observing    -0.60
behold       -0.60
atically     -0.59
bart         -0.59
Penal        -0.59
typing       -0.58
Model        -0.56
Interested   -0.56
writing      -0.55
Positive Logits
been         1.45
undergone    1.19
become       1.16
begun        1.13
arisen       1.10
been         1.09
gotten       1.02
risen        1.00
Been         0.96
gone         0.96
Activations Density: 0.261%