INDEX
    NEGATIVE LOGITS
    Token       Logit
    "_DA"       -0.07
    (not shown) -0.07
    "ПО"        -0.06
    " thị"      -0.06
    "P"         -0.06
    "unkt"      -0.06
    "PV"        -0.06
    " erste"    -0.06
    " önem"     -0.06
    "Chron"     -0.06
    POSITIVE LOGITS
    Token       Logit
    " bits"     0.24
    " Bits"     0.15
    "Bits"      0.11
    " BITS"     0.10
    " Bit"      0.09
    " Bella"    0.08
    " Beit"     0.08
    "_bits"     0.08
    "BITS"      0.08
    "<bits"     0.07
    Activation Density: 0.009%

    No Known Activations