INDEX
Explanations
references to high-ranking officials or positions of authority
Negative Logits
ubits      -0.16
ibilit     -0.16
ög         -0.15
ctype      -0.14
jing       -0.14
psilon     -0.14
ynch       -0.14
ë§ģ        -0.14
agic       -0.14
orian      -0.14
Positive Logits
dom        0.24
lain       0.22
executive  0.21
ëª         0.21
Whip       0.21
Executive  0.19
Operating  0.19
isay       0.19
nut        0.19
-exec      0.19
Activations Density: 0.015%