INDEX
    Explanations

    numbers followed by Billion or percent

    Negative Logits
    ों                  0.48
    s                  0.42
    ים                 0.39
    0                  0.38
    k                  0.37
     Understanding     0.35
     Abstracts         0.35
     Headphones        0.35
     Epidemiology      0.34
     Algorithms        0.34
    Positive Logits
    ли                 0.52
                       0.49
     statunitense      0.43
    р                  0.43
     despre            0.41
     világ             0.40
    もら               0.40
    на                 0.39
     victoire          0.39
    an                 0.39
    Act Density 0.585%

    No Known Activations