    Negative Logits
    Token                  Logit
    (token not rendered)   -0.08
    " affili"              -0.07
    " Kite"                -0.07
    " arranger"            -0.07
    " gusto"               -0.07
    " Haf"                 -0.07
    "ייכ"                  -0.07
    (token not rendered)   -0.07
    " derivatives"         -0.07
    " další"               -0.07
    Positive Logits
    Token                  Logit
    " cylinders"           0.11
    " cylinder"            0.10
    " Cylinder"            0.10
    " cylindrical"         0.10
    (token not rendered)   0.09
    " slabs"               0.09
    " trunks"              0.09
    " cilind"              0.09
    "ylinder"              0.08
    " slab"                0.08
    Activation Density: 0.004%

    No Known Activations