    Negative Logits
    Token                 Logit
    (blank)               -0.65
    "BASEPATH"            -0.64
    " gynhyrchwyd"        -0.63
    " <<<<<<<<<<<<<<"     -0.56
    "ährlich"             -0.55
    ")\}$"                -0.52
    "decer"               -0.52
    " vetorial"           -0.51
    "__':"                -0.51
    "ategorien"           -0.51
    Positive Logits
    Token                 Logit
    " sp"                 0.65
    "ia"                  0.60
    "ByExample"           0.56
    " Sp"                 0.55
    " surla"              0.52
    "ites"                0.51
    "ite"                 0.51
    "sp"                  0.50
    "wife"                0.50
    " muualla"            0.50
    Activation Density: 0.002%

    No Known Activations