INDEX
    Explanations

    references to historical figures and events, especially from ancient Rome

    Negative Logits
    Token         Logit
    "ROY"         -0.15
    "ellen"       -0.15
    "ãĥ¥ãĥ¼"      -0.15
    " 견"         -0.14
    " Chim"       -0.14
    "ialias"      -0.14
    "iali"        -0.14
    "CodeGen"     -0.13
    "pek"         -0.13
    "_Grid"       -0.13
    Positive Logits
    Token         Logit
    " Pub"        0.27
    "Pub"         0.24
    " Quint"      0.22
    " Sext"       0.22
    " Це"         0.21
    " Brut"       0.20
    " Marcus"     0.20
    "Tit"         0.20
    " Tit"        0.20
    " Serv"       0.20
    Activation Density: 0.009%

    No Known Activations