INDEX
    Explanations

    words and phrases associated with influential figures and their impact in various contexts

    Negative Logits
     itſelf      -1.62
     houſe       -1.55
     purpoſe     -1.55
     pleaſure    -1.53
     ſtate       -1.49
     myſelf      -1.44
     Houſe       -1.42
     ſche        -1.38
     iſt         -1.37
     ſever       -1.36
    Positive Logits
     (    1.16
    ,     1.09
          1.07
    :     0.99
    /     0.97
     "    0.96
    -     0.95
     -    0.92
    ?     0.90
    ;     0.89
    Act Density 7.348%

    No Known Activations