INDEX
    Explanations

    repeated patterns or references to "loop" in various contexts

    Negative Logits
    "ibs"     -0.18
    "ebo"     -0.18
    "quez"    -0.16
    "onne"    -0.15
    "lee"     -0.15
    "onse"    -0.15
    ".uf"     -0.15
    "wig"     -0.14
    " Cru"    -0.14
    "../"     -0.14
    Positive Logits
    "-loop"   0.27
    "(loop"   0.21
    " loop"   0.21
    " Loop"   0.20
    "Loop"    0.19
    "loop"    0.19
    " LOOP"   0.18
    " loops"  0.17
    ".loop"   0.17
    "otron"   0.16
    Act Density 0.016%

    No Known Activations