INDEX
    Explanations

    references to specific entities or legal terms

    Negative Logits
     تضيفلها
    -0.65
    زای
    -0.56
    %@",
    -0.54
     %@",
    -0.52
    '],
    
    -0.50
    mallow
    -0.49
    parag
    -0.49
    رای
    -0.49
     kasarigan
    -0.48
    )":
    -0.48
    Positive Logits
     ها
    1.05
    ‌ها
    0.93
    ها
    0.81
     روستا
    0.62
    وها
    0.57
    تها
    0.54
    يتها
    0.49
     nanti
    0.49
     and
    0.48
    برها
    0.48
    Activation Density: 0.004%

    No Known Activations