    Negative Logits
     lainnya  -0.78
     الأخرى  -0.67
    tidaknya  -0.65
     berikutnya  -0.64
    setupUi  -0.63
     andern  -0.63
    AddTagHelper  -0.62
     restantes  -0.61
     JSTOR  -0.60
     autre  -0.60
    Positive Logits
    worldly  1.54
     than  1.23
    than  0.93
    Than  0.89
     niż  0.87
     THAN  0.80
    world  0.78
    था  0.77
     similar  0.74
    wiſe  0.72
    Activation Density: 0.119%

    No Known Activations