INDEX
    NEGATIVE LOGITS
    " jati"     -0.61
    " nikah"    -0.59
    "gaja"      -0.58
    "celona"    -0.56
    " adil"     -0.55
    "baya"      -0.53
    " jaya"     -0.51
    "jela"      -0.50
    "jati"      -0.50
    "jaja"      -0.50
    POSITIVE LOGITS
    "man"     1.11
    "MAN"     0.99
    "Man"     0.87
    " Man"    0.82
    " MAN"    0.78
    "mans"    0.76
    " man"    0.73
    "iman"    0.71
    "Mans"    0.68
    "eman"    0.67
    Activation Density: 0.139%

    No Known Activations