    NEGATIVE LOGITS
    "(lib"           -0.09
    " errands"       -0.08
    "markers"        -0.08
    "ynu"            -0.08
    " supervised"    -0.08
    " hábitos"       -0.08
    "cripts"         -0.07
    " quicker"       -0.07
    "讲话"           -0.07
    "_keywords"      -0.07
    POSITIVE LOGITS
    " सुंदर"           0.17
    " جمال"           0.16
    " સુંદર"           0.15
    " Schönheit"      0.15
    " 아름"           0.15
    " beautiful"      0.15
    " beauty"         0.15
    " magnificent"    0.14
    " красоты"        0.14
    " breathtaking"   0.14
    Activation Density: 0.115%

    No Known Activations