INDEX
    Explanations
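    NEGATIVE LOGITS
    <sub>  0.40
    W  0.40
    title  0.39
    \*  0.38
     লিপিবদ্ধ  0.38
    B  0.38
    (no visible token)  0.38
    <sup>  0.37
    (no visible token)  0.37
    header  0.37
    POSITIVE LOGITS
    ovjek  0.42
    putnik  0.41
    ર્ગ  0.40
    (no visible token)  0.40
    (no visible token)  0.39
    🦳  0.39
     गोविंद  0.39
    (no visible token)  0.38
    (no visible token)  0.38
     άνθρω  0.38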
    Act Density 0.002%

    No Known Activations