INDEX
Negative Logits
パネル	-0.92
Bt	-0.89
beträ	-0.84
وسی	-0.81
одном	-0.81
върху	-0.80
credibility	-0.80
явля	-0.80
𝜔	-0.78
關於	-0.78
POSITIVE LOGITS
thankful	1.70
grateful	1.51
thanking	1.48
thanked	1.38
for	1.37
agradec	1.34
Thanksgiving	1.30
Thanks	1.21
gratitude	1.20
agrade	1.20
Activations Density 0.005%