INDEX
    Explanations

    numerical information such as ages or quantities

    references to specific ages, demographics, or notable individuals in discussions

    Negative Logits
     acknow         -0.64
     tyr            -0.60
    "))             -0.57
     happ           -0.57
    aughed          -0.56
     Defin          -0.52
    SourceFile      -0.52
    estern          -0.52
     doesnt         -0.49
    equality        -0.49
    Positive Logits
    ,               1.00
    ,.              0.96
    _.              0.91
    .               0.89
    .,              0.87
    *,              0.85
    !,              0.84
    *.              0.83
    .[              0.78
    %.              0.76
    Activation Density: 0.723%

    No Known Activations