INDEX
Explanations
manufacturing and factories
Negative Logits
were       0.63
and        0.53
d          0.52
d          0.51
υ          0.49
4          0.48
ی          0.47
the        0.47
ma         0.46
outcrop    0.45
Positive Logits
Factories  0.45
綘         0.45
బాద్       0.44
honti      0.42
工場       0.42
妗         0.41
години     0.41
пра        0.40
ON         0.40
)。        0.40
Activations Density 0.001%