    Negative Logits
    Token            Logit
    " successful"    -1.14
    " success"       -1.06
    "successful"     -1.05
    " successes"     -0.95
    " erfolgreiche"  -0.93
    "success"        -0.87
    "successfully"   -0.82
    ""               -0.81
    " erfolgre"      -0.79
    "Success"        -0.79
    Positive Logits
    Token            Logit
    " intérieure"    0.68
    " edelstahl"     0.65
    "lorette"        0.63
    "lula"           0.63
    " köln"          0.62
    " attribut"      0.61
    "ossa"           0.61
    " loob"          0.61
    " Kassel"        0.60
    "UNRELATED"      0.60
    Activation Density: 0.489%

    No Known Activations