INDEX
    Explanations

    phrases related to making sense or rationality

    Negative Logits
    "_Impl"             -0.18
    ".scalablytyped"    -0.17
    "uma"               -0.14
    "tps"               -0.14
    " Sophia"           -0.13
    " fflush"           -0.13
    "ãĥ³ãĤ¿"            -0.13
    " Suff"             -0.13
    "£i"                -0.13
    "akah"              -0.13
    Positive Logits
    " sense"            0.59
    "sense"             0.44
    " Sense"            0.41
    " sentido"          0.36
    "Sense"             0.35
    " sene"             0.35
    " senses"           0.34
    " sens"             0.32
    "SEN"               0.32
    " scn"              0.31
    Act Density 0.030%

    No Known Activations