INDEX
    Explanations

    references to linear relationships and equations in mathematical contexts

    Negative Logits
    Token      Logit
    olo        -0.17
    a          -0.16
    agal       -0.15
    au         -0.15
    ome        -0.14
    astr       -0.14
    iw         -0.14
     margin    -0.14
     Bol       -0.14
     Kiss      -0.14
    Positive Logits
    Token           Logit
    ichier          0.17
    ized            0.16
    nez             0.16
    atica           0.16
    WindowTitle     0.15
    affen           0.15
    rod             0.14
     èĩªåĬ¨çĶŁæĪIJ    0.14
    áty             0.14
    onymous         0.14
    Activation Density: 0.033%

    No Known Activations