INDEX
Explanations
names followed by associated terms
Negative Logits
ਾ    1.16
ه    1.09
↵    1.04
%    1.01
د    1.00
ల    0.99
h    0.98
_    0.97
৫    0.97
or    0.96
Positive Logits
𝐨    1.24
تری    1.10
𝐢    1.09
تين    1.08
1    1.06
、    1.05
تقديم    1.05
𝐜    1.05
магнит    1.03
ت    1.03
Activations Density 0.005%