INDEX
    Negative Logits
    Token            Logit
    "ка"             1.07
    "ك"              0.97
    " في"            0.86
    " إنه"           0.81
    " в"             0.77
    "𝟬"              0.75
    "да"             0.73
    " gesturing"     0.73
    " ehemal"        0.71
    "ካከል"            0.69
    Positive Logits
    Token            Logit
    "ע"              0.95
    "z"              0.91
                     0.88
    " and"           0.87
    " avuto"         0.80
    "RA"             0.80
    "h"              0.79
    "b"              0.78
    "-"              0.76
    " has"           0.75
    Activation Density: 1.202%

    No Known Activations