    NEGATIVE LOGITS
    " የማይ"  0.43
    "ouvrir"  0.40
    " فتحه"  0.39
    " membuka"  0.39
    "ኘት"  0.38
    "(()"  0.38
    "ivism"  0.37
    " открыть"  0.37
    "opensource"  0.37
    0.37

    POSITIVE LOGITS
    "model"  0.61
    "Model"  0.59
    " model"  0.59
    " Model"  0.55
    " Moz"  0.51
    " Modell"  0.50
    " modèle"  0.49
    " модели"  0.49
    "模型"  0.48
    " MODEL"  0.47

    Activation Density 0.020%

    No Known Activations