INDEX
    Explanations

    uncertainty and articulation

    Negative Logits
    Token               Logit
    " Usual"            0.90
    " রাখুন"             0.89
    " bekannten"        0.89
    " ваши"             0.88
    "你需要"             0.88
    " Consideration"    0.88
    " bekannte"         0.87
    " Considerations"   0.86
    " principales"      0.84
    "Như"               0.84
    Positive Logits
    Token               Logit
    " exact"            1.76
    " exactly"          1.74
    " exacte"           1.40
    "exactly"           1.38
    " precisely"        1.37
    " precise"          1.33
    "exact"             1.33
    " precies"          1.30
    "究竟"               1.23
    " Exactly"          1.23
    Activation Density: 0.164%

    No Known Activations