Explanations
sentences that convey advice or cautionary guidance
Negative Logits
royalty        -0.75
birthplace     -0.74
descendants    -0.72
flagship       -0.70
namesake       -0.68
ancestry       -0.67
crashed        -0.67
crude          -0.67
boarded        -0.66
leaked         -0.66
Positive Logits
Ideally        1.54
Otherwise      1.48
Include        1.30
Especially     1.29
Doing          1.28
Lastly         1.25
If             1.25
Then           1.25
Avoid          1.24
Remember       1.23
Activations Density 0.153%