INDEX
    Explanations
    NEGATIVE LOGITS
    "-ab"      -0.09
    " MV"      -0.09
    " mav"     -0.09
    " abras"   -0.09
    "-av"      -0.09
    "_ab"      -0.08
    " ван"     -0.08
    " vam"     -0.08
    " hasn"    -0.08
    " mp"      -0.08
    POSITIVE LOGITS
    "(angle"   0.20
    " angle"   0.16
    " Angle"   0.16
    "Angle"    0.15
    "angle"    0.15
    " angles"  0.14
    ".angle"   0.14
    "_angle"   0.13
    "angles"   0.13
    " Ang"     0.13
    Act Density 0.038%

    No Known Activations