INDEX
Explanations
phrases expressing positive sentiment or approval
expressions of enthusiasm or positivity
Negative Logits
mitt       -0.68
ilus       -0.67
canon      -0.66
clips      -0.64
former     -0.62
ethyl      -0.62
ople       -0.62
pex        -0.62
missions   -0.62
eter       -0.62
Positive Logits
sword        0.91
opportunity  0.89
strides      0.82
deal         0.81
outdoors     0.81
fun          0.80
idea         0.79
tasting      0.76
insight      0.76
shield       0.73
Activations Density 0.068%
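
The density figure can be read as the share of sampled tokens on which the feature fires. The page does not state its exact definition, so the sketch below assumes density means "fraction of tokens with activation above a threshold of zero". The function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical and used only for illustration.

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: density = (tokens where the feature fires) / (all tokens).
    """
    if not activations:
        raise ValueError("need at least one activation value")
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Hypothetical sample: 10,000 tokens, 7 of which activate the feature.
sample = [0.0] * 9993 + [1.2, 0.4, 3.1, 0.9, 2.2, 0.1, 5.0]
density = activation_density(sample)
print(f"Activations Density {density:.3%}")
```

Under this reading, a density of 0.068% would mean the feature is active on roughly 7 of every 10,000 tokens, which is a sparse feature.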