INDEX
Explanations
expressions of gratitude or thanks
expressions of gratitude and thankfulness
Negative Logits
opers    -0.74
okin     -0.74
oto      -0.67
opped    -0.63
change   -0.62
uve      -0.62
uter     -0.62
agate    -0.61
quer     -0.60
bill     -0.60
Positive Logits
giving      1.22
fulness     0.88
ctuary      0.85
acknowled   0.84
citiz       0.82
gements     0.80
deity       0.80
God         0.79
gments      0.78
pardon      0.78
Activations Density: 0.042%