INDEX
Explanations
acquiring or obtaining something
Negative Logits
Token   Value
с       1.62
ス      1.45
える    1.32
ナ      1.27
س       1.26
ب       1.23
ین      1.21
ра      1.16
ia      1.13
1       1.09
Positive Logits
Token   Value
'       1.89
↵       1.61
on      1.26
ri      1.22
u       1.11
ro      1.10
x       1.08
)       1.07
m       1.06
w       1.06
Activations Density: 0.165%