INDEX
    Explanations

    phenomena and their descriptions

    NEGATIVE LOGITS
     putting        0.50
    IF              0.47
     scooters       0.47
     childcare      0.44
     sitting        0.44
     conversions    0.43
     IF             0.43
     removing       0.43
     converting     0.43
    P               0.41
    POSITIVE LOGITS
     Literatur      0.51
     hasattr        0.47
     fenómenos      0.47
                    0.43
     María          0.43
     fenómeno       0.42
                    0.42
    ޞ               0.41
     Pérez          0.41
    ława            0.41
    Act Density 0.001%

    No Known Activations