INDEX
    Explanations

    high-frequency pronouns and common linking words that connect thoughts or statements

    Negative Logits
    \{\\             -0.65
    InjectAttribute  -0.65
     '\\;'           -0.61
     laikā           -0.56
     ویکی‌آمباردا     -0.56
     kysy            -0.55
     zelve           -0.55
    достатки         -0.54
    cartão           -0.54
    ỡng              -0.54
    Positive Logits
    ंदीखरीदारी        0.40
     unen            0.39
    )--(             0.39
    ">—              0.39
     nakalista       0.39
    ukone            0.38
    >//              0.38
    --               0.38
    <bos>            0.37
     raiſ            0.36
    Act Density 0.082%

    No Known Activations