INDEX
Explanations
the word 'capital' and words nearby it
Negative Logits
.     -0.96
a     -0.91
(     -0.80
,     -0.78
↵     -0.77
+     -0.75
↵↵    -0.71
to    -0.66
[     -0.66
is    -0.66
Positive Logits
AddTagHelper       1.73
تضيفلها            1.66
виправивши         1.58
propOrder          1.54
EconPapers         1.51
"]);               1.41
__":               1.41
']))               1.39
__':               1.38
SequentialGroup    1.37
Activations Density 1.613%