INDEX
    Explanations

    HTML tags and their attributes

    Negative Logits
    " var"          -0.52
    "thered"        -0.44
    " qu"           -0.43
    "ugo"           -0.43
    (blank)         -0.42
    " influence"    -0.42
    " true"         -0.42
    "кри"           -0.42
    "NELL"          -0.41
    "</i>"          -0.41
    Positive Logits
    " pinulongan"   0.78
    " للمعارف"      0.76
    "Хьажоргаш"     0.73
    "+#+#"          0.73
    " الحره"        0.73
    "ſelf"          0.71
    "Condol"        0.70
    " ―――――"        0.69
    (blank)         0.68
    "ValueStyle"    0.66
    Act Density 0.003%

    No Known Activations