INDEX
    Explanations

    proper nouns or names, particularly those of people

    Negative Logits
    ']],
    -0.92
     }}$}
    -0.89
    ".
    
    -0.86
    '],
    
    -0.85
    )++;
    -0.84
    "]];
    -0.84
    ')],
    -0.83
    ."]
    -0.82
    ))));
    -0.82
    '>";
    -0.82
    Positive Logits
     Jr
    0.64
    Przypisy
    0.55
    ,
    0.55
     (
    0.54
    '
    0.46
    1
    0.46
     &
    0.46
    (
    0.46
    mann
    0.44
    b
    0.44
    Activation Density 0.375%

    No Known Activations