    Negative Logits
    Token         Logit
    " Wagner"     -0.10
    " Vend"       -0.10
    "莫"          -0.10
    " gonna"      -0.10
    " wan"        -0.09
    "atura"       -0.09
    "IRR"         -0.08
    " Clair"      -0.08
    "ought"       -0.08
    " thirst"     -0.08
    Positive Logits
    Token         Logit
    " want"       0.16
    " muốn"       0.16
    " wants"      0.16
    " wanted"     0.16
    " ingin"      0.14
    "ريد"         0.13
    "้องการ"       0.13
    " wish"       0.13
    "要"          0.12
    " хот"        0.11
    Activation Density: 0.037%

    No Known Activations