INDEX
    Explanations

    base level or as a foundation

    Negative Logits
    Token         Logit
    "ین"          0.78
    "ар"          0.72
    "ю"           0.70
    " уви"        0.68
    " zacz"       0.67
    "لي"          0.66
    "я"           0.66
    "пробу"       0.63
    "ید"          0.63
    " испыта"     0.63
    Positive Logits
    Token         Logit
    " Base"       1.44
    " base"       1.29
    "Base"        1.27
    "base"        1.18
    " for"        1.05
    " BASE"       1.05
    "BASE"        0.90
    " बेस"        0.90
    " Bases"      0.90
    "ฐาน"         0.87
    Act Density 0.042%

    No Known Activations