Explanations
phrases related to authority and permission
Negative Logits
opers       -0.72
�醒         -0.72
earch       -0.65
ight        -0.64
————        -0.63
heast       -0.61
Trash       -0.61
�           -0.60
etheless    -0.60
Mines       -0.60
Positive Logits
authorization   0.89
designation     0.80
moniker         0.79
reservation     0.76
avail           0.71
ausp            0.71
author          0.70
account         0.69
lication        0.67
distinction     0.67
Activations Density: 1.045%