    Explanations

    significant numerical quantities or ratios

    Negative Logits
    Token                 Logit
    " badass"             -0.23
    " freaking"           -0.18
    " impactful"          -0.18
    " transitioning"      -0.17
    " policymakers"       -0.17
    " transformative"     -0.17
    "-focused"            -0.16
    "-esque"              -0.16
    " nuanced"            -0.16
    " policym"            -0.15
    Positive Logits
    Token                 Logit
    " :-↵"                0.23
    " .........."         0.22
    " ........"           0.21
    " ................"   0.20
    ":-"                  0.19
    " viz"                0.19
    " ,("                 0.19
    "…………"                0.19
    " !!"                 0.18
    " :-"                 0.18
    Act Density 0.808%

    No Known Activations