Explanations
relative pronouns or possessives
Negative Logits
Token        Value
is           0.56
footprint    0.50
on           0.45
in           0.44
footprints   0.44
ையை          0.43
X            0.43
<img>        0.43
=            0.43
ing          0.42
Positive Logits
Token        Value
whom         0.55
الذين        0.55
Whom         0.50
जिनकी        0.47
हत्याकांड    0.47
wszel        0.46
是谁         0.46
whose        0.46
كلهم         0.44
cuyos        0.43
Activations Density: 0.066%
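
To give a sense of scale for the density figure, here is a minimal sketch. It assumes that "activation density" means the share of sampled tokens on which the feature's activation is nonzero; the page itself does not define the term. The function name `activation_density` and the sample values are hypothetical and are not taken from the dashboard.

```python
# Assumption: "activation density" is the fraction of sampled tokens on which
# the feature fires (activation > 0). This definition is not stated on the page.

def activation_density(activations):
    """Return the fraction of activations that are strictly positive."""
    if not activations:
        raise ValueError("need at least one activation")
    active = sum(1 for a in activations if a > 0)
    return active / len(activations)

# Hypothetical sample (made-up values): 2 active tokens out of 8.
sample = [0.0, 0.0, 1.3, 0.0, 0.0, 0.0, 0.7, 0.0]
density = activation_density(sample)

# Convert the reported percentage into a count per 100,000 tokens.
reported = 0.066 / 100
tokens_per_100k = round(reported * 100_000)
```

Under that reading, a density of 0.066% means the feature is active on roughly 66 out of every 100,000 tokens, which would fit a feature tied to a narrow class of words such as relative pronouns.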