    NEGATIVE LOGITS
     pools          -0.07
    Measure         -0.07
    _listener       -0.07
                    -0.07
     titulo         -0.07
     Diet           -0.06
     dilemma        -0.06
    /***/           -0.06
    	head           -0.06
    Tree            -0.06
    POSITIVE LOGITS
     seaborn        0.10
     sns            0.08
     villagers      0.06
                    0.06
    lien            0.06
    eturn           0.06
    oya             0.06
     vẫn            0.05
     Γι             0.05
     habil          0.05
    Activation Density: 0.001%

    No Known Activations