INDEX
    Explanations

    words related to political figures or events

    proper nouns and names, particularly those associated with events and public figures

    Negative Logits
    Token       Logit
    " Inquis"   -1.02
    " FE"       -1.00
    " Fei"      -0.96
    "FE"        -0.91
    " Ez"       -0.83
    "isf"       -0.83
    " FD"       -0.83
    "fe"        -0.80
    " Flip"     -0.78
    " Flo"      -0.78
    Positive Logits
    Token       Logit
    "rams"      0.98
    "roma"      0.85
    "rom"       0.81
    "haar"      0.79
    "RAM"       0.79
    "SAM"       0.78
    " Bav"      0.78
    "ARA"       0.78
    "ram"       0.78
    "Ram"       0.78
    Act Density 0.606%

    No Known Activations