INDEX
    Explanations
    NEGATIVE LOGITS
    Token          Logit
    (blank)        -1.36
    "."            -1.36
    (blank)        -1.30
    " recently"    -1.18
    " often"       -1.17
    "}$."          -1.13
    " encontrou"   -1.13
    "וּ"            -1.11
    " parfois"     -1.10
    "ַּ"            -1.10
    POSITIVE LOGITS
    Token          Logit
    " will"        1.83
    " sẽ"          1.37
    " будет"       1.35
    "žní"          1.27
    " וְ"           1.27
    "neous"        1.27
    "taya"         1.23
    "älde"         1.22
    " cintur"      1.18
    " reuni"       1.16
    Activation Density 0.005%

    No Known Activations