Explanations
details related to comparisons or contrasts in narratives
Negative Logits
assi        -0.19
ãģŁãĤī      -0.17
yonel       -0.16
ijd         -0.15
.toolbox    -0.15
ãĤĤãģĨ      -0.14
iasi        -0.14
rotterdam   -0.14
antz        -0.14
ยม          -0.14
Positive Logits
respectively   0.37
latter         0.36
respective     0.28
both           0.27
each           0.26
former         0.25
neither        0.24
each           0.24
both           0.24
Both           0.23
Activations Density: 0.339%
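
For readers unfamiliar with the density figure, the sketch below shows one common way such a number could be computed: the fraction of sampled token positions on which the feature's activation is above zero. This definition, the function name activation_density, the threshold parameter, and the sample values are all assumptions made for illustration; the page itself does not state how its density was derived.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds threshold.

    Assumed definition (not confirmed by the source page): a position
    counts as "active" when its activation is strictly above threshold.
    """
    values = list(activations)
    if not values:
        return 0.0  # no samples, so report zero rather than divide by zero
    active = sum(1 for a in values if a > threshold)
    return active / len(values)


# Hypothetical sample: 1000 token positions, 3 of them active.
sample = [0.0] * 997 + [0.8, 1.2, 0.05]
print(f"{activation_density(sample):.3%}")  # prints 0.300%
```

Under this reading, a density of 0.339% would mean the feature fires on roughly 3 to 4 of every 1000 sampled tokens.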