INDEX
    Explanations
    Negative Logits
    .dat        -0.07
     жил        -0.06
    iffies      -0.06
    meaning     -0.06
     şüph       -0.06
    Correo      -0.06
     temporal   -0.06
    _CONV       -0.06
    ilies       -0.06
    genes       -0.06
    Positive Logits
    }`;↵        0.08
    ')))        0.08
    board       0.07
     arch       0.07
     });↵       0.07
    (""));↵     0.07
    ;');↵       0.06
    }}"         0.06
    ']]]↵       0.06
    SIZE        0.06
    Activation Density: 0.006%

    No Known Activations