INDEX
    Explanations
    Negative Logits
    Token        Logit
    "Slinky"     -0.07
    "lat"        -0.07
    "877"        -0.07
    " стол"      -0.07
    " Bilim"     -0.06
    "\tdef"      -0.06
    "яття"       -0.06
    "อำนวย"      -0.06
    "573"        -0.06
    "668"        -0.06
    Positive Logits
    Token        Logit
    " network"   0.13
    " networks"  0.11
    " Network"   0.10
    "network"    0.10
    " Networks"  0.09
    "Network"    0.09
    " NETWORK"   0.09
    "NETWORK"    0.08
    " NN"        0.08
    ".Network"   0.08
    Act Density 0.032%

    No Known Activations