INDEX
    Explanations

    reconciling opposing concepts

    Negative Logits
    Token   Logit
    س       1.13
    та      1.02
    ti      0.98
    ta      0.96
    ts      0.93
    ty      0.86
    ter     0.85
    s       0.83
    ten     0.82
    sh      0.82
    Positive Logits
    Token         Logit
    {             0.85
    (             0.83
    ان            0.82
    (not shown)   0.80
    (not shown)   0.79
    บุ            0.72
    (not shown)   0.66
    (not shown)   0.64
    もら          0.63
    an            0.62
    Activation Density: 0.003%

    No Known Activations