INDEX
    Explanations

    intellectual task that human

    NEGATIVE LOGITS
    " depth"             0.41
    "Vinyl"              0.38
    "плат"               0.37
    [token not shown]    0.37
    "Dad"                0.36
    "ponents"            0.36
    "vinyl"              0.36
    " distintas"         0.36
    "心理"               0.36
    "深度"               0.35
    POSITIVE LOGITS
    " Metal"             0.62
    " Metall"            0.56
    " Slayer"            0.55
    " Nuclear"           0.53
    "Metal"              0.52
    " NUCLEAR"           0.49
    "METAL"              0.48
    " Metals"            0.47
    [token not shown]    0.47
    " metals"            0.46
    Act Density 0.002%

    No Known Activations