INDEX
Explanations
sentences expressing appreciation or recognition
expressions of appreciation and support
Negative Logits
filler      -0.79
tack        -0.79
inverse     -0.77
magnetic    -0.74
elusive     -0.74
phantom     -0.73
interpol    -0.73
spinning    -0.73
pir         -0.72
unexpl      -0.72
Positive Logits
Thank       1.44
Therefore   1.16
Tonight     1.12
Govern      1.10
Chair       1.04
Please      1.02
Today       1.01
Article     0.99
Upon        0.98
Peace       0.97
Activations Density: 0.410%