INDEX
    Explanations

    important statements or justifications

    phrases indicating significant consequences or importance

    Negative Logits
    Fuck       -0.72
    Yep        -0.72
    Damn       -0.69
    Awesome    -0.69
    ulhu       -0.67
    Enjoy      -0.67
     finished  -0.65
     haha      -0.65
     Finish    -0.64
     hopped    -0.64
    Positive Logits
     Suppose         0.90
     embodiments     0.87
     economists      0.82
     methodological  0.81
     proponents      0.81
     empirical       0.80
     theorists       0.80
     policymakers    0.79
    typically        0.79
     practitioners   0.79
    Act Density 0.916%

    No Known Activations