    Explanations

    adding items or concepts

    Negative Logits
    Token             Logit
    "Differences"     0.39
    "şa"              0.39
    "Differ"          0.37
    " Differences"    0.37
    "ષા"              0.37
    " বণ্ট"            0.37
    "াশ"              0.36
    "σκεται"          0.36
    " sagt"           0.36
    "Example"         0.36
    Positive Logits
    Token             Logit
    " hinzu"          1.34
    " added"          1.04
    " доба"           1.04
    " добавля"        1.02
    " add"            1.00
    " thêm"           0.97
    " добавить"       0.96
    " добав"          0.95
    " hinzufügen"     0.94
    " adicion"        0.93
    Activation Density: 0.049%

    No Known Activations