INDEX
    Negative Logits
     shape    -1.97
     Shape    -1.78
    shape     -1.69
    Shape     -1.64
     shapes   -1.53
     SHAPE    -1.43
    SHAPE     -1.34
     shaped   -1.29
     Shapes   -1.28
    shapes    -1.20
    Positive Logits
     '\\;'          0.67
    TagHelper       0.62
    schild          0.59
    transQ          0.53
     recensement    0.52
    hoeddwyd        0.51
    andom           0.49
                    0.48
    žil             0.48
    jano            0.48
    Act Density 0.064%

    No Known Activations