    Explanations

    else branching after condition

    Negative Logits
    Token               Value
    "𝘔"                 2.02
    "𝐌"                 1.98
    "Ру"                1.80
    "𝐑"                 1.78
    "РО"                1.74
    [no token text]     1.73
    "Мак"               1.70
    [no token text]     1.70
    "Рабо"              1.70
    "Pré"               1.68
    Positive Logits
    Token               Value
    [no token text]     1.79
    ","                 1.62
    "e"                 1.60
    "г"                 1.50
    "льский"            1.38
    "0"                 1.38
    "glow"              1.37
    "↵↵"                1.30
    "v"                 1.30
    " ("                1.27
    Activation Density: 0.013%

    No Known Activations