INDEX
Explanations
strong, action-oriented language related to pushing boundaries, limits, and envelopes
phrases related to pushing boundaries or limits
Negative Logits
Token        Logit
efe          -0.67
SEA          -0.65
cin          -0.64
nces         -0.63
paying       -0.62
ottest       -0.61
cence        -0.61
cise         -0.60
/-           -0.60
consulted    -0.60
Positive Logits
Token        Logit
boundaries   1.48
envelope     1.41
buttons      1.22
limits       1.21
button       1.12
onward       0.97
Limits       0.96
harder       0.95
wedge        0.95
boundary     0.90
Activations Density: 0.214%