Explanations
sentences that give advice or instructions
sentences that provide insights or recommendations
Negative Logits
sovereignty   -0.78
millenn       -0.77
unmanned      -0.74
birthplace    -0.73
disemb        -0.73
alleged       -0.73
unbeaten      -0.72
savage        -0.71
waged         -0.71
unaccount     -0.71
Positive Logits
Alternatively  1.60
Depending      1.59
Especially     1.56
Otherwise      1.49
Usually        1.49
Besides        1.45
Likewise       1.45
Luckily        1.45
Ideally        1.44
Fortunately    1.42
Activations Density: 0.460%