INDEX
Explanations
official titles or positions
references to officials or official statements
Negative Logits
ï¸          -1.00
ĸļ          -0.91
esville     -0.84
xual        -0.73
âķIJ         -0.72
âķIJâķIJ      -0.72
oppers      -0.72
avery       -0.72
Bengal      -0.70
Beg         -0.69
Positive Logits
official        0.84
dom             0.81
ially           0.80
sanctioned      0.80
announcement    0.74
fiat            0.73
iating          0.71
wcs             0.70
pronounce       0.70
documentation   0.68
Activations Density 0.021%
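The density figure is commonly read as the fraction of sampled tokens on which the feature is active. The page does not state how it was computed, so the following is only a minimal sketch under that assumption; `activation_density` and its `threshold` parameter are hypothetical names, and the sample data is synthetic.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" means the share of tokens with a non-zero
    (above-threshold) feature activation.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


if __name__ == "__main__":
    # Synthetic sample: 21 active tokens out of 100,000.
    sample = [0.0] * 99_979 + [1.5] * 21
    print(f"{activation_density(sample):.3%}")
```

A density this low means the feature is sparse: on the synthetic sample above it fires on roughly 2 tokens in every 10,000.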