INDEX
    Explanations
    Negative Logits
    Token           Logit
    " Efq"          -0.93
    " Eſ"           -0.91
    " raiſ"         -0.86
    " Majefty"      -0.83
    "ensement"      -0.82
    " Theſe"        -0.81
    " difp"         -0.81
    " ſtate"        -0.80
    " Jefus"        -0.77
    "neſs"          -0.76
    Positive Logits
    Token           Logit
    "ViewFeatures"  0.84
    "berg"          0.55
    "حدى"           0.54
    "abetes"        0.54
    (blank)         0.53
    " أحد"          0.53
    ","             0.52
    "[["            0.48
    "BERG"          0.48
    " إحدى"         0.47
    Activation Density: 0.091%

    No Known Activations