INDEX
    Explanations

    specific nouns and concepts

    Negative Logits
    Token            Value
    "ק"              0.47
    " yogurt"        0.46
    " curd"          0.46
    " shrimp"        0.45
    " نہیں۔"         0.43
    "dict"           0.42
    "BOOL"           0.42
    " copying"       0.42
    " meringue"      0.41
    " triggered"     0.41
    Positive Logits
    Token            Value
    "anın"           0.47
    "arów"           0.46
    "ostęp"          0.45
    " vän"           0.43
    "igheder"        0.42
    "ઓની"            0.42
    " 있다는"         0.41
    "中的"            0.41
    " записи"        0.41
    "amot"           0.41
    Activation Density: 0.000%

    No Known Activations