    Negative Logits
    ف                0.39
    ز                0.35
                     0.34
    ปี                0.33
    ست               0.31
    ם                0.31
    ه                0.30
    ν                0.29
    م                0.29
    را               0.29
    Positive Logits
     a               0.40
     at              0.37
     of              0.37
     hapless         0.34
                     0.34
    л                0.32
    <unused2137>     0.32
    <unused1145>     0.30
    してた           0.30
    =");             0.29
    Act Density 0.000%

    No Known Activations