INDEX
    Explanations
    Negative Logits
     is           -1.44
     For          -1.42
    usda          -1.38
     webbplats    -1.38
     for          -1.38
    templateUrl   -1.36
    ruguay        -1.35
     on           -1.35
     ficou        -1.33
     что          -1.32
    POSITIVE LOGITS
    1             1.52
     ſaid         1.47
     殼           1.45
     Kristus      1.41
    Parce         1.41
     ftate        1.37
     temen        1.34
     火鍋         1.34
    Réponse       1.34
     struktur     1.33
    Act Density 0.014%

    No Known Activations