    Explanations

    structured formatting elements in text

    Negative Logits
    Token                  Logit
    "arch"                 -0.15
    " moy"                 -0.14
    "lou"                  -0.14
    "moil"                 -0.14
    "cow"                  -0.14
    " Arrow"               -0.14
    "elia"                 -0.14
    "urch"                 -0.13
    ".stopPropagation"     -0.13
    "l"                    -0.13

    Positive Logits
    Token                  Logit
    "figure"                0.25
    " figure"               0.22
    "flush"                 0.18
    "center"                0.16
    "rys"                   0.15
    "enumerate"             0.15
    "spacing"               0.15
    "Flush"                 0.15
    "zier"                  0.15
    "ÑĨеп"                  0.14
    Act Density 0.033%

    No Known Activations