INDEX
Explanations
provide instructions or content
Negative Logits
Token    Value
ند    1.03
as    1.01
لي    0.97
hvilket    0.96
但是    0.93
الص    0.92
يد    0.92
³.    0.89
९    0.89
mu    0.89
Positive Logits
Token    Value
T    1.99
K    1.78
M    1.75
G    1.73
V    1.67
D    1.58
ב    1.55
W    1.53
R    1.52
H    1.51
Activations Density 0.145%