INDEX
    Explanations

    acknowledging interjections

    Negative Logits
    1.44  (token not shown)
    1.38  (token not shown)
    1.35  ن
    1.31  ית
    1.23  ן
    1.20  на
    1.20  (token not shown)
    1.19  (whitespace)
    1.18  et
    1.17   possessions
    Positive Logits
    1.61  ٘
    1.49  سی
    1.45  𝒍
    1.40  𝐨
    1.37  theit
    1.37  (token not shown)
    1.36  (token not shown)
    1.35  (token not shown)
    1.35  𝑳
    1.34  quando
    Act Density 0.618%

    No Known Activations