    Explanations

    specific nouns and proper nouns, particularly names and identifiers

    Negative Logits
    Token               Logit
    "Autoritní"         -0.78
    " فريبيس"           -0.75
    " jsPsych"          -0.75
    "ftagPool"          -0.71
    " للمعارف"          -0.71
    " للاسماء"          -0.71
    "دانشنامهٔ"          -0.70
    "ArgsConstructor"   -0.70
    " ModelExpression"  -0.70
    " تضيفلها"          -0.69

    Positive Logits
    Token               Logit
    " szint"            0.52
    "Another"           0.47
    "another"           0.46
    " berikutnya"       0.45
    "ALC"               0.44
    " another"          0.44
    "ordning"           0.43
    "↵↵"                0.42
    "avons"             0.42
    "greenrobot"        0.41
    Activation Density: 0.539%

    No Known Activations