INDEX
    Explanations

    conjunctions and transitional phrases indicating contrast or exception

    Negative Logits
    Token                Logit
    "LEncoder"           -0.92
    " AssemblyProduct"   -0.82
    " ModelExpression"   -0.77
    (blank)              -0.73
    " purpoſe"           -0.73
    ",:);"               -0.73
    " myſelf"            -0.72
    " }}$}"              -0.71
    "Переваги"           -0.70
    " Efq"               -0.70
    Positive Logits
    Token      Logit
    " yet"     0.58
    " but"     0.54
    "<bos>"    0.53
    " S"       0.53
    " one"     0.51
    " dés"     0.51
    " الز"     0.44
    "S"        0.44
    "ment"     0.44
    (missing)  0.44
    Activation Density: 0.290%

    No Known Activations