INDEX
Explanations
occurrences of conjunctions and relevant phrases indicating comparison or relationship
Negative Logits
oka      -0.17
udit     -0.15
oji      -0.15
angel    -0.15
odos     -0.15
ippi     -0.14
ulas     -0.14
eve      -0.14
cesso    -0.14
neust    -0.14
Positive Logits
beyond   0.26
the      0.23
its      0.22
Beyond   0.21
/as      0.20
Beyond   0.19
Its      0.18
eyond    0.18
/of      0.17
their    0.16
Activations Density: 0.113%