INDEX
    NEGATIVE LOGITS
    <no visible token>  0.72
    TextUtils           0.70
    <no visible token>  0.68
    <no visible token>  0.67
    ('                  0.67
     :-)                0.67
    )}(                 0.66
    ());                0.66
    <no visible token>  0.66
    ("                  0.65
    POSITIVE LOGITS
     [     5.56
    [      4.58
     $[    3.82
     \[    3.57
     [-    3.50
    -[     3.49
     [_    3.45
     [$    3.44
     [\    3.40
     [,    3.38
    Activation Density: 1.006%

    No Known Activations