INDEX
    NEGATIVE LOGITS
    此同时            0.69
    (blank)          0.68
    (blank)          0.67
    (blank)          0.65
    0                0.64
     quienes         0.62
    我又              0.59
    ת                0.59
     możemy          0.59
    ليه              0.58
    POSITIVE LOGITS
    mbh              0.60
    ösen             0.51
     Encyclopædia    0.51
    tsó              0.50
     Descriptive     0.50
    (blank)          0.50
     excitations     0.50
     tumors          0.49
    /                0.49
    (                0.49
    Activation Density: 0.000%

    No Known Activations