INDEX
    Explanations

    Gemma, llama, LaMDA, LGA

    Tokens that are named entities or proper nouns (product or model names, people, places, and other capitalized terms).

    Negative Logits
    Token             Value
    "l"               1.92
    "k"               1.57
    "t"               1.55
    "n"               1.55
    "r"               1.42
    "lari"            1.41
    "j"               1.40
    (not rendered)    1.37
    "یم"              1.29
    "il"              1.23
    Positive Logits
    Token             Value
    (not rendered)    1.64
    "2"               1.51
    " σε"             1.45
    (not rendered)    1.32
    "та"              1.30
    " في"             1.30
    "ாதி"             1.30
    "ا"               1.27
    "ة"               1.24
    "্শন"             1.22
    Act Density 0.769%

    No Known Activations
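
    The Act Density figure above is usually read as the fraction of token positions on which a feature is active. The sketch below shows that calculation under that assumption. The function name `activation_density`, the `threshold` parameter, and the sample values are all hypothetical and do not come from this page or from any dashboard's actual code.

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means (number of active positions) / (total positions).
    An empty input is treated as density 0.0.
    """
    if len(activations) == 0:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


if __name__ == "__main__":
    # Hypothetical per-token activations for one feature (not real data).
    sample = [0.0, 0.0, 0.5, 0.0]
    print(f"Act Density {activation_density(sample):.3%}")
```

    Under this reading, a density of 0.769% would mean the feature fires on fewer than 8 of every 1,000 tokens sampled.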