    Explanations

    references to awards or recognitions in films

    Negative Logits
     تضيفلها
    -0.90
    الحياه
    -0.83
     beginnetje
    -0.82
    AxisAlignment
    -0.81
    StringCopy
    -0.81
    <bos>
    -0.80
     ब्रेकडाउन
    -0.80
    AddTagHelper
    -0.79
    יצוני
    -0.78
     Portail
    -0.78
    Positive Logits
    ↵↵
    0.64
    0.47
     joint
    0.44
    ,
    0.43
    .
    0.41
    (
    0.41
    0.41
     -
    0.41
    的同时
    0.40
     =
    0.39
    Act Density 0.865%

    No Known Activations