INDEX
    Explanations

    phrases indicating preferences or choices

    Negative Logits
    Token         Logit
    " russes"     -0.73
    " poffe"      -0.70
    " Reſ"        -0.70
    " itſelf"     -0.70
    "ſelf"        -0.69
    " houſe"      -0.68
    " himſelf"    -0.66
    "käse"        -0.66
    " Conſ"       -0.65
    " alve"       -0.64
    Positive Logits
    Token       Logit
    " about"    1.75
    " ABOUT"    1.71
    "ABOUT"     1.63
    " About"    1.56
    " abt"      1.48
    "About"     1.47
    "bout"      1.40
    "about"     1.40
    "Bout"      1.35
    " Bout"     1.26
    Activation Density: 0.114%

    No Known Activations