INDEX
    Explanations

    phrases indicating repetition or emphasizing additional examples

    Negative Logits
    AutoresizingMask    -0.94
     Efq                -0.85
    IsMutable           -0.79
    DoubleQuotes        -0.79
    ChromeDriver        -0.78
    MessageOf           -0.78
     Monfieur           -0.78
    AddTagHelper        -0.76
     poffe              -0.76
    aarrggbb            -0.75

    Positive Logits
     Just               0.51
                        0.47
     ‘                  0.47
                        0.47
    gud                 0.46
     Mar                0.45
     AllMovie           0.45
     just               0.44
    Lur                 0.44
     برانيه             0.43

    Activation Density: 0.093%

    No Known Activations