INDEX
    Explanations

    names followed by common conversational words

    punctuation

    Negative Logits
    "Viitteet"           -0.48
    "$_"                 -0.46
    "form"               -0.45
    "HomeAsUpEnabled"    -0.45
    " par"               -0.43
                         -0.42
    " loadImage"         -0.42
    " form"              -0.41
    " regularity"        -0.41
    "ահ"                 -0.41
    Positive Logits
    "✨:"                 0.74
    " ویکی‌پدیای"         0.64
    " kasarigan"         0.59
    "writeField"         0.59
    " وتسجيلات"          0.58
    "tagHelperRunner"    0.57
    "CloseOperation"     0.55
    "Tikang"             0.54
    "الحياه"             0.54
    " الرياضيه"          0.54
    Activation Density 0.128%

    No Known Activations