    Explanations

    words related to political figures and events

    Negative Logits
    Token     Logit
     KB       -0.88
    çİĭ       -0.79
     MB       -0.79
    ç¥ŀ       -0.78
    AGES      -0.74
     670      -0.72
     346      -0.71
     Gear     -0.70
     Bohem    -0.70
    650       -0.69
    Positive Logits
    Token     Logit
    re        1.36
    rez       1.12
    RE        1.12
    reb       1.10
    arre      0.99
    rey       0.98
    reys      0.97
    ère       0.97
    rem       0.96
    res       0.93
    Activation Density: 0.134%

    No Known Activations