Explanations
phrases related to positivity
Negative Logits
Token       Logit
loo         -0.88
Brilliant   -0.73
HAEL        -0.73
opsy        -0.70
Mellon      -0.68
wine        -0.68
tracks      -0.64
spo         -0.63
Lucia       -0.63
Leod        -0.63
Positive Logits
Token       Logit
itional     1.60
itions      1.43
itivity     1.31
itionally   1.22
idon        1.16
itor        1.15
ited        1.09
itors       1.07
itiveness   1.06
icion       1.06
Activations Density: 0.061%