INDEX
    Explanations

    references to abstract ideas or theoretical concepts

    Negative Logits
    Token            Logit
    " resistenza"    -0.66
    " hår"           -0.66
    "ásban"          -0.65
    "abon"           -0.62
    "WriteTo"        -0.59
    " Corro"         -0.57
    ":^("            -0.56
    " skrift"        -0.56
    "rgba"           -0.56
    " ederim"        -0.56
    Positive Logits
    Token            Logit
    " concepts"      2.18
    " concept"       2.14
    " CONCEPT"       1.88
    " Concept"       1.88
    "Concept"        1.84
    " Concepts"      1.80
    "concept"        1.79
    "Concepts"       1.74
    "concepts"       1.68
    " CONCEPTS"      1.57
    Activation Density: 0.062%

    No Known Activations