INDEX
Explanations
Detroit city and its associations
Negative Logits
Token    Value
0        2.27
ות       1.63
0        1.34
by       1.31
for      1.28
as       1.26
u        1.26
ן        1.22
but      1.21
from     1.20
Positive Logits
Token        Value
ل            1.08
ल            0.98
Detroit      0.97
ל            0.96
Det          0.94
しっかりと    0.93
म            0.93
욌           0.92
ren          0.92
м            0.92
Activations Density 0.006%