INDEX
Explanations
explanations or statements within text
phrases or statements about predictions and intentions
Negative Logits
Specifically    -0.75
Suggest         -0.74
Analy           -0.72
Indeed          -0.71
DIT             -0.71
suggest         -0.71
Recommend       -0.68
phasis          -0.68
explan          -0.67
Materials       -0.67
Positive Logits
immortality     1.09
Elvis           0.96
Adolf           0.93
thood           0.92
afterlife       0.89
Kobe            0.88
Jesus           0.88
Oprah           0.88
Peyton          0.87
heaven          0.87
Activations Density 0.587%