INDEX
    Explanations

    phrases indicating location or position

    Negative Logits
    %.
    -0.49
    -0.45
    .
    
    -0.44
    }.
    -0.44
    ++.
    -0.42
    ".
    -0.42
    ().
    -0.41
    +.
    -0.41
    ].
    -0.41
    ).
    -0.40
    Positive Logits
    CloseOperation
    0.76
    ients
    0.72
     autorytatywna
    0.68
    habits
    0.67
     queſta
    0.66
    ſelben
    0.65
    pires
    0.64
    ftagPool
    0.63
     GenerationType
    0.60
     ComVisible
    0.60
    Activation Density: 1.200%

    No Known Activations