INDEX
Explanations
words related to legal proceedings and oaths
references to oaths or pledges of truthfulness
Negative Logits
Stras       -0.72
========    -0.71
iga         -0.69
Sa          -0.68
aple        -0.68
nesota      -0.66
ppa         -0.65
Mini        -0.65
SEN         -0.65
Cater       -0.64
Positive Logits
oath        1.34
Oath        1.11
sworn       1.01
ylum        0.89
breaker     0.87
vow         0.87
swearing    0.84
naire       0.82
thouse      0.82
bringer     0.79
Activations Density: 0.008%