    Explanations

    pathways and connections

    Negative Logits
    ни     0.78
    м      0.74
    ש      0.67
    يل     0.64
    \      0.60
    dır    0.59
    мм     0.58
    ния    0.57
    2      0.57
    то     0.57
    Positive Logits
     atheros   0.61
    ing        0.58
     i         0.54
     e         0.52
    าค         0.52
               0.52
               0.50
     όπως      0.50
     o         0.49
     autor     0.49
    Activation Density: 0.487%

    No Known Activations