Explanations
references to oaths, vows, and commitments
Negative Logits
Token          Logit
.metamodel     -0.17
hread          -0.16
opis           -0.16
ordion         -0.16
889            -0.15
ÐIJÑĢÑħÑĸв     -0.15
Insensitive    -0.14
mÃŃt           -0.14
گرÛĮ           -0.14
AutoSize       -0.14
Positive Logits
Token          Logit
oath           0.47
swearing       0.35
sworn          0.34
swear          0.33
o              0.32
vows           0.31
pledge         0.31
vow            0.31
èª             0.29
pledges        0.27
Activations Density 0.061%