INDEX
    Explanations

    self-preservation, minimizing losses

    Negative Logits
    یم          0.52
    ere         0.47
    ie          0.46
     mumkin     0.45
    Soc         0.44
     bea        0.43
     possible   0.43
     l          0.43
    yla         0.42
    en          0.42
    Positive Logits
     ресторан   0.61
    лизи        0.58
    酒店          0.56
    蛋白質         0.52
    НЕ          0.51
    SHOP        0.49
    污水          0.49
    收费          0.48
    𝘢           0.48
     restaurante 0.48
    Activation Density: 0.003%

    No Known Activations