INDEX
Explanations
references to individuals, particularly the name "Abdul."
Negative Logits
uds       -0.19
irie      -0.17
UD        -0.15
yers      -0.15
abel      -0.15
inary     -0.15
arity     -0.15
=<?=$     -0.14
dashed    -0.14
esen      -0.14
Positive Logits
рах       0.24
Rahman    0.23
atif      0.23
Lat       0.21
Az        0.20
raz       0.20
lat       0.20
Az        0.20
raham     0.20
az        0.20
Activations Density 0.007%