    Explanations

    color codes in various formats

    NEGATIVE LOGITS
    "])\n           -1.13
    ()]\n           -1.08
     Baillargeon   -1.04
     propOrder     -1.02
     ویکی‌پدی       -0.99
    !")\n           -0.97
    AutoScale      -0.97
    ()]);          -0.95
    "]\n            -0.94
    "]);\n          -0.91
    POSITIVE LOGITS
     '#     0.98
    :'#     0.94
    ="#     0.93
    ='#     0.83
     "#     0.81
    ('#     0.80
    '#      0.78
    "#      0.77
    :"#     0.75
    ','#    0.71
    Act Density 0.018%

    No Known Activations