INDEX
Explanations
positive sentiments or praise
expressions of high praise or positive sentiment
Negative Logits
ople         -0.98
eter         -0.77
cling        -0.74
ilus         -0.73
bus          -0.72
SPONSORED    -0.71
former       -0.69
eters        -0.69
idon         -0.66
©¶æ          -0.65
Positive Logits
sword             0.89
strides           0.89
introductory      0.79
opportunity       0.78
ãĥ¤               0.73
offence           0.73
enough            0.72
accomplishment    0.71
deal              0.71
teamwork          0.71
Activations Density 0.034%