INDEX
    Explanations

    lieutenant followed by rank

    Negative Logits
    Token   Logit
    as      -3.33
    In      -2.95
    u       -2.88
    N       -2.86
    an      -2.86
    in      -2.80
    2       -2.61
    J       -2.58
    er      -2.55
    Q       -2.55
    Positive Logits
    2.58
     '
    2.55
    🅃
    2.53
    🅣
    2.42
    2.38
    𓁹
    2.30
    2.22
    
    
    2.19
    2.17
    𝕿
    2.13
    Act Density 0.004%

    No Known Activations