    Explanations

    proper nouns related to individuals and government figures

    Negative Logits
    Token        Logit
    ":[          -0.66
    ':           -0.66
    ascus        -0.65
    Interest     -0.64
    .....        -0.61
    ciplinary    -0.61
    .......      -0.61
    "))          -0.61
    …..          -0.58
    hend         -0.58
    Positive Logits
    Token           Logit
    !).             1.24
    ?).             1.22
    !),             1.12
    ?),             1.02
    !)              0.97
    ?)              0.95
    ).              0.88
    ).              0.88   (token has a leading space)
    サーティワン    0.83   (token has a leading space)
    )?              0.82
    Act Density 0.817%

    No Known Activations