INDEX
    Explanations

    most common and versatile

    Negative Logits
    " very"            0.57
    " rất"             0.57
    " sehr"            0.56
    " дуже"            0.55
    " बहुत"            0.55
    " очень"           0.54
    " foarte"          0.54
    " muy"             0.54
    "बहुत"             0.53
    " bardzo"          0.52
    Positive Logits
    " arguably"        0.50
    " least"           0.41
    " wohl"            0.40
    " probabilmente"   0.40
    " wealthiest"      0.39
    "argu"             0.38
    " rarest"          0.38
    "probably"         0.38
    " наверное"        0.38
    "Probably"         0.38
    Activation Density: 0.047%

    No Known Activations