INDEX
Explanations
references to Semitic languages and their associated terms
Negative Logits
Token    Logit
Par      -0.65
gh       -0.64
Per      -0.64
on       -0.60
<eos>    -0.59
–        -0.58
zz       -0.57
Ver      -0.57
D        -0.56
T        -0.56
Positive Logits
Token            Logit
Hebrew           1.26
Hebrew           1.20
RenderAtEndOf    1.16
Hebrews          1.09
myſelf           1.03
Theſe            0.96
་་               0.96
Malayalam        0.91
Hebrews          0.90
engertian        0.89
Activations Density: 0.003%