    Explanations

    phrases indicating well-established or recognized concepts

    Negative Logits
    Token        Logit
    " Frey"      -0.71
    "גרת"        -0.69
    "Frey"       -0.63
    "ുടെ"        -0.62
    " Julie"     -0.62
    "yczą"       -0.60
    " Cowper"    -0.60
    "תוך"        -0.59
    " Matti"     -0.58
    "EI"         -0.58
    Positive Logits
    Token          Logit
    " known"       2.22
    " Known"       2.07
    " KNOWN"       2.02
    "known"        2.00
    "Known"        1.98
    "KNOWN"        1.81
    " connue"      1.41
    " conocido"    1.36
    " conocida"    1.35
    " connu"       1.31
    Activation Density: 0.091%

    No Known Activations