INDEX
    Explanations

    instances of the letter "S" in various forms or contexts

    Negative Logits
    надлеж    -0.24
    галі      -0.23
    рогра     -0.21
    пня       -0.20
    бря       -0.19
    itters    -0.18
    наслід    -0.17
    №№        -0.17
    ицин      -0.16
    ,         -0.16
    Positive Logits
    уществ    0.26
    оглас     0.25
    реди      0.24
    писок     0.23
    егодня    0.23
    остав     0.23
    одерж     0.23
    ейчас     0.22
    истем     0.22
    луж       0.22
    Act Density 0.011%

    No Known Activations