INDEX
Explanations
recommendations given in text
recommendations and endorsements in the text
Negative Logits
tal      -0.77
cod      -0.74
san      -0.71
Sus      -0.71
Alic     -0.68
sen      -0.67
vous     -0.67
von      -0.66
Ern      -0.65
kered    -0.65
Positive Logits
recommending       1.10
recommend          1.08
recomm             1.03
recommends         1.03
recommendations    0.94
Recommend          0.94
recommended        0.93
recommendation     0.90
avorite            0.88
livest             0.88
Activations Density: 0.014%