INDEX
    Explanations

    scientific/technical texts

    Negative Logits
    Token             Logit
    "Coefficient"     -0.07
    "��"              -0.06
    "فاده"            -0.06
    ".Ticks"          -0.06
    " Prints"         -0.06
    " zach"           -0.06
    "personal"        -0.06
    "_INTR"           -0.06
    "PRS"             -0.06
    " τρό"            -0.06

    Positive Logits
    Token             Logit
    " transgender"    0.07
    " transmitting"   0.07
    "öh"              0.07
    " transported"    0.07
    " moc"            0.06
    " frivol"         0.06
    "öğ"              0.06
    " demise"         0.06
    "sb"              0.06
    "ğ"               0.06

    Act Density 0.009%

    No Known Activations