INDEX
    Explanations

    symbols and punctuation marks

    Negative Logits
    ers
    -0.73
    te
    -0.64
     الشرق
    -0.61
    osh
    -0.60
    er
    -0.60
     Dol
    -0.60
     Thy
    -0.59
    (
    -0.59
     dol
    -0.58
     Goy
    -0.57
    Positive Logits
    }))
    1.29
    ]")]
    1.24
    }))\n
    1.17
    ])
    1.16
     referenties
    1.12
    })]
    1.11
    '])
    1.10
    })
    1.09
    "]))
    1.09
    ])]
    1.08
    Act Density 0.860%

    No Known Activations