Explanations
sentences expressing experiences or accomplishments
positive expressions related to personal experiences and achievements
Negative Logits
Token      Logit
erv        -0.70
reference  -0.65
ague       -0.65
ifice      -0.58
azard      -0.58
abus       -0.58
orno       -0.58
ucc        -0.57
oof        -0.57
farious    -0.57
Positive Logits
Token      Logit
improved   0.94
wonderful  0.88
rewarded   0.86
revital    0.85
amazing    0.85
rejuven    0.85
marvelous  0.83
thank      0.83
splendid   0.83
outper     0.82
Activations Density: 1.586%
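
The density figure can be read as the share of sampled token positions on which the feature is active. The page does not state how it was computed, so the following is only a minimal sketch under that assumption: the function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical and used purely for illustration.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of positions where the feature fires.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 1000 token positions, 16 of which are active.
sample = [0.0] * 984 + [0.5] * 16
density = activation_density(sample)
print(f"{density:.3%}")  # prints 1.600%
```

Under this reading, a density of 1.586% would mean the feature fires on roughly 16 of every 1,000 sampled tokens.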