Explanations
mentions of congratulations
expressions of congratulations and celebratory sentiments
Negative Logits
blacklist       -0.80
inconsist       -0.77
persecut        -0.74
rors            -0.74
prosecutions    -0.73
Viol            -0.73
Unless          -0.72
illegal         -0.72
bankrupt        -0.71
icides          -0.71
Positive Logits
congratulated   1.53
thanked         1.43
compliment      1.35
joked           1.28
hugged          1.25
cheered         1.22
exclaimed       1.21
congrat         1.20
applauded       1.17
thanking        1.16
Activations Density 0.627%
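
The density figure above is most naturally read as the fraction of token positions on which the feature is active. The page does not state how the number was computed, so the sketch below is only an illustration under that assumption: the function name `activation_density`, the zero threshold, and the sample data are all hypothetical and not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means (active positions) / (all positions).
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 1000 token positions, 6 of which are active.
sample = [0.0] * 994 + [0.4, 1.2, 0.7, 2.1, 0.3, 0.9]
density = activation_density(sample)
print(f"{density:.3%}")  # prints "0.600%"
```

Under this reading, a density of 0.627% would correspond to the feature firing on roughly 6 of every 1000 tokens in the sampled text.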