Explanations
references to religious texts and studies
Negative Logits
ALAR        -0.17
举          -0.15
predecess   -0.15
icas        -0.15
unordered   -0.14
hire        -0.14
shapes      -0.14
rád         -0.14
iba         -0.14
unordered   -0.13
Positive Logits
リング      0.19
_RET        0.16
Ring        0.15
lsa         0.15
ring        0.15
andır       0.15
Kurul       0.14
det         0.14
onet        0.14
osa         0.14
Activations Density 0.230%