INDEX
Explanations
positive words or expressions
expressions of positivity and positive sentiment
Negative Logits
loo        -0.84
Brilliant  -0.71
HAEL       -0.70
opsy       -0.70
Leod       -0.68
tracks     -0.68
wine       -0.66
Hearts     -0.63
Recall     -0.62
stall      -0.62
Positive Logits
itional    1.51
itions     1.34
itivity    1.26
idon       1.15
itionally  1.13
itory      1.10
itiveness  1.07
itor       1.06
ited       1.04
icion      1.03
Activations Density: 0.029%