INDEX
Explanations
No Explanations Found
Negative Logits
ﯽ  1.16
for  1.12
circulate  1.10
ر  1.09
er  1.07
ר  1.02
at  1.01
ﮯ  1.01
fight  0.98
ﻰ  0.93
Positive Logits
c  0.89
k  0.87
القرآن  0.86
0.85
kval  0.83
_  0.80
نوع  0.80
kter  0.79
argued  0.77
أغسطس  0.77
Activations Density 0.000%