INDEX
    Explanations

    specific phrases related to the number two and its variations

    Negative Logits
     two
    -0.18
    两个
    -0.16
     twee
    -0.16
    二人
    -0.16
     beiden
    -0.15
     vat
    -0.15
    hai
    -0.15
     Nos
    -0.14
     both
    -0.14
    inkle
    -0.14
    Positive Logits
    -One
    0.18
    ména
    0.18
    vier
    0.17
    iences
    0.17
    之一
    0.17
    ToOne
    0.16
     birden
    0.16
    igne
    0.15
    同时
    0.15
    fram
    0.15
    Act Density 0.107%

    No Known Activations