INDEX
Explanations
Harvard University and business
Negative Logits
1.63
ك  1.21
羅  0.91
اي  0.90
지를  0.90
ли  0.88
ки  0.84
бо  0.82
中有  0.82
国的  0.81
Positive Logits
in  1.55
s  1.41
r  1.38
st  1.31
f  1.19
ay  1.14
a  1.14
,  1.12
ad  1.09
us  1.08
Activations Density 0.001%
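
For readers unfamiliar with the density figure, here is a minimal sketch of how such a number could be computed. It assumes that density means the fraction of sampled tokens on which the feature has a nonzero activation. The function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical and are not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumption: "density" is read as (active tokens) / (all sampled tokens).
    """
    if not activations:
        raise ValueError("need at least one activation value")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: the feature fires on 1 token out of 100,000 sampled tokens.
sample = [0.0] * 99_999 + [3.2]
density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.001%
```

Under this reading, a density of 0.001% means the feature is very sparse: it fires on roughly one token in every hundred thousand.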