    Explanations
    Negative Logits
    Token            Logit
    "oris"           -0.67
    "ãĥī"            -0.58
    "irs"            -0.56
    "////////"       -0.55
    "INGS"           -0.54
    "omas"           -0.54
    " poses"         -0.54
    "ffen"           -0.54
    " Cosponsors"    -0.54
    "arest"          -0.53
    Positive Logits
    Token            Logit
    " thirty"        1.09
    " twenty"        1.06
    " five"          1.05
    " fifteen"       1.05
    " decade"        1.03
    " six"           1.02
    " four"          1.01
    " three"         1.00
    " forty"         0.98
    " half"          0.96
    Activation Density: 0.071%

    No Known Activations