INDEX
    Explanations

    phrases that discuss perspectives and perceptions of the world and social issues

    Negative Logits
    Token      Logit
    "onta"     -0.16
    "assin"    -0.15
    "aku"      -0.14
    "avra"     -0.14
    "onto"     -0.14
    "ippy"     -0.14
    "913"      -0.14
    "ATCH"     -0.13
    "wheel"    -0.13
    " temp"    -0.13
    Positive Logits
    Token             Logit
    " differently"    0.33
    " through"        0.28
    "Through"         0.26
    "through"         0.25
    " Through"        0.25
    " THROUGH"        0.22
    " thru"           0.21
    " través"         0.19
    " af"             0.19
    "_through"        0.19
    Act Density 0.144%

    No Known Activations