INDEX
    Explanations

    phrases that describe characteristics and specifications of objects in a structured format

    Negative Logits
     hele      -0.46
     hiç       -0.42
    Invoke     -0.42
     vůbec     -0.42
    ULAR       -0.41
     aller     -0.41
    GO         -0.41
    ous        -0.39
    .          -0.39
    ЛЕ         -0.39
    Positive Logits
     each             1.23
     individually     1.11
    EACH              1.08
     Each             1.08
    each              1.06
    Each              1.04
     EACH             1.03
    Chaque            1.00
     Efq              0.99
     Majefty          0.97
    Activation Density: 0.475%

    No Known Activations