Explanations
expressions of gratitude
Thanking or greeting someone
thank you for
Negative Logits
Houſe       -0.97
Diſ         -0.88
Majefty     -0.87
Conſ        -0.84
Reſ         -0.84
houſe       -0.83
myſelf      -0.83
Skocz       -0.81
ſche        -0.81
pleaſure    -0.81
Positive Logits
for         1.00
so          0.57
frumos      0.56
very        0.54
davvero     0.51
sincerely   0.51
!           0.51
ért         0.51
Thankyou    0.49
ancora      0.49
Activations Density 0.033%