INDEX
    Explanations

    descriptive adjectives and nouns

    Negative Logits
    "किसी"  0.46
    " apabila"  0.46
    "Essa"  0.45
    "ուն"  0.44
    " بعض"  0.42
    "Su"  0.42
    " عندها"  0.41
    "بعض"  0.41
    " una"  0.41
    "una"  0.41
    Positive Logits
    " Abs"  0.54
    " Demonstr"  0.48
    " Demonstration"  0.47
    " Hypot"  0.43
    " Traum"  0.43
    "леко"  0.42
    " Asc"  0.42
    " Demonstrate"  0.42
    " Elimin"  0.41
    " Dist"  0.40
    Activation Density: 0.008%

    No Known Activations