    Explanations

    front lines and frontrunners

    Negative Logits
     in    1.20
    ן    1.16
    ע    1.16
     of    1.14
     is    1.13
    л    1.08
    и    1.04
    v    1.03
    ים    1.02
    ]    0.92
    Positive Logits
    Авто    0.96
    Ро    0.95
    Три    0.95
    Ни    0.95
    Пер    0.94
    ن    0.94
    Сер    0.91
    На    0.90
    Ра    0.90
    Ин    0.90
    Activation Density: 0.004%

    No Known Activations