    Explanations

    words or phrases related to procedural steps or instructions

    Negative Logits
    Token                             Logit
    'tahankan'                        -0.91
    'olesale'                         -0.88
    ' Barbier'                        -0.87
    '"])' (followed by whitespace)    -0.86
    '}});'                            -0.83
    'ividual'                         -0.81
    (token missing)                   -0.80
    ' všem'                           -0.78
    ' rubia'                          -0.78
    '|}{}'                            -0.77
    Positive Logits
    Token       Logit
    ' step'     2.69
    ' STEP'     2.56
    ' Step'     2.53
    'Step'      2.50
    'step'      2.44
    ' steps'    2.43
    ' Steps'    2.25
    'STEP'      2.25
    ' STEPS'    2.06
    'steps'     2.06
    Activation Density: 0.049%

    No Known Activations