INDEX
Explanations
discourse surrounding promises and pledges made by individuals, particularly in political contexts
Negative Logits
vej      -0.15
Lace     -0.14
late     -0.14
anda     -0.14
æķ£      -0.13
rez      -0.13
_RCC     -0.13
frage    -0.13
Ļ        -0.13
NBC      -0.13
Positive Logits
end      0.16
asmine   0.15
Bras     0.15
ürn      0.15
uent     0.15
roti     0.15
undy     0.15
اÙĦتÙĤ   0.14
iesen    0.14
-La      0.14
Activations Density: 0.378%