INDEX
    Explanations

    physical states and actions

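    Negative Logits
    批评              0.35    (criticism)
    नैतिक             0.35    (moral, ethical)
    경제              0.33    (economy)
    gouvernement      0.32    (government)
    观念              0.32    (idea, concept)
    权力              0.32    (power, authority)
    ఆర్థిక            0.31    (economic, financial)
    (token not shown) 0.31
    პროგრამ           0.31    (word fragment: program)
    юриди             0.30    (word fragment: legal, juridical)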
    Positive Logits
    underneath        0.40
    trapped           0.38
    convex            0.38
    melt              0.37
    bulb              0.37
    clump             0.37
    surface           0.36
    wedge             0.36
    crushed           0.36
    underside         0.36
    Activation Density: 0.306%

    No Known Activations