INDEX
    Explanations

    mentions of significant individuals and their professional or personal relationships

    Negative Logits
    !).
    -0.34
     ).
    -0.31
    }.
    -0.31
     }.
    -0.30
    ?).
    -0.29
    !.
    -0.28
    !!.
    -0.28
    `.
    -0.28
     ].
    -0.28
    ”.
    -0.28
    Positive Logits
    .,↵
    0.36
    ,↵
    0.35
    ..↵
    0.32
    0.30
    .'↵
    0.29
    .*↵
    0.28
    /↵
    0.27
    '↵
    0.26
    .↵
    0.26
    ，↵
    0.25
    Act Density 0.273%

    No Known Activations