INDEX
    Explanations

    significant events and discussions related to societal and historical contexts

    Negative Logits
    "МО"         -0.16
    "etail"      -0.15
    "Backing"    -0.14
    "leigh"      -0.14
    "oling"      -0.14
    "ANEL"       -0.14
    "eper"       -0.13
    "prav"       -0.13
    "ンジ"       -0.13
    "UNCTION"    -0.13
    Positive Logits
    " devant"    0.67
    " before"    0.67
    "before"     0.59
    " перед"     0.54
    " Before"    0.53
    " front"     0.50
    "Before"     0.50
    "-before"    0.48
    " przed"     0.48
    "_before"    0.47
    Activation Density 0.449%

    No Known Activations