INDEX
    Explanations

    references to historical figures and events, particularly related to specific monarchs and their reigns

    Negative Logits
    istory
    -0.16
    abic
    -0.15
    .fac
    -0.14
    usunda
    -0.14
    侍
    -0.13
    eer
    -0.13
    ανδ
    -0.13
    DA
    -0.13
    ute
    -0.12
    ears
    -0.12
    Positive Logits
     himself
    0.19
    рович
    0.17
     Magnus
    0.15
    -dropdown
    0.15
    действ
    0.14
     son
    0.14
    ROW
    0.14
    son
    0.14
     who
    0.14
    Overlap
    0.14
    Act Density 0.095%

    No Known Activations