INDEX
    Explanations

    comparative phrases and specific examples of concepts

    Negative Logits
     rê          -0.48
    גון          -0.47
     cientí      -0.46
     raccol      -0.45
     např        -0.44
    pecies       -0.43
     textos      -0.43
     cuci        -0.43
     like        -0.43
     telles      -0.43
    Positive Logits
     Houſe       0.88
     houſe       0.86
     myſelf      0.81
    ſelf         0.81
     poffe       0.80
     chofe       0.79
    ſelves       0.78
     itſelf      0.78
     Jefus       0.76
     raiſ        0.75
    Act Density 0.107%

    No Known Activations