INDEX
    Negative Logits
    Token             Value
    " corte"          0.50
    " sort"           0.48
    " neurologic"     0.46
    " pathologic"     0.46
    " menace"         0.44
    " estate"         0.44
    " terra"          0.43
    " circa"          0.43
    " carnaval"       0.43
    " practise"       0.43
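    Positive Logits
    Token             Value
    " XNUMX"          0.78
    " ​​"               0.66  (token is invisible zero-width characters)
    " և"              0.62
    " !,"             0.59
    " ”,"             0.57
    "which"           0.57
    " ،"              0.53
    (blank in source) 0.53
    " ؛"              0.52
    "​​"                0.52  (token is invisible zero-width characters)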
    Activation Density: 0.003%

    No Known Activations