INDEX
    Explanations

    The word 'capital' and the words near it

    Negative Logits
    Token          Logit
    .              -0.96
    a              -0.91
    (              -0.80
    ,              -0.78
    (not shown)    -0.77
    +              -0.75
    ↵↵             -0.71
    to             -0.66
    [              -0.66
    is             -0.66
    Positive Logits
    Token              Logit
    AddTagHelper       1.73
    تضيفلها            1.66
    виправивши         1.58
    propOrder          1.54
    EconPapers         1.51
    "]);               1.41
    __":               1.41
    ']))               1.39
    __':               1.38
    SequentialGroup    1.37
    Activation Density: 1.613%

    No Known Activations