INDEX
Explanations
references to administrative roles and tasks
occurrences of the word "admin" and its variations
Negative Logits
Reloaded       -0.76
lihood         -0.74
False          -0.73
enegger        -0.72
¶æ             -0.71
IMAGES         -0.68
Norn           -0.67
Empires        -0.65
çīĪ            -0.65
Temperature    -0.65
Positive Logits
stration       1.34
admin          1.09
admin          1.04
isters         1.03
uthor          0.97
ister          0.97
administ       0.93
istry          0.89
rative         0.86
Admin          0.85
Activations Density: 0.013%