INDEX
    Explanations

    references to political figures and government positions in various contexts

    Negative Logits
    " colonel"    -0.52
    " doctor"     -0.50
    " king"       -0.47
    " queen"      -0.46
    "doctor"      -0.46
    " mr"         -0.45
    "先生"        -0.44
    " prince"     -0.44
    " miss"       -0.42
    " sultan"     -0.42
    Positive Logits
    " Acting"     1.20
    " Assistant"  1.02
    "Acting"      0.97
    " Deputy"     0.96
    " Vice"       0.86
    " Interim"    0.84
    " Secretary"  0.84
    " Associate"  0.83
    " acting"     0.81
    " Chief"      0.80
    Activation Density: 0.567%

    No Known Activations