INDEX
Explanations
the term "latter" and its variations, indicating comparisons or contrasts in discussions
Negative Logits
ernen       -0.16
rå          -0.15
nict        -0.14
vre         -0.14
ying        -0.14
uese        -0.14
انه         -0.13
ibraltar    -0.13
dal         -0.13
ł           -0.13
Positive Logits
most        0.31
mentioned   0.23
-most       0.20
-day        0.20
beiden      0.19
mentioned   0.19
category    0.17
-described  0.17
lain        0.17
two         0.16
Activations Density: 0.036%