INDEX
    Explanations

    phrases indicating quality or evaluation

    Negative Logits
    Token          Logit
    kasarigan      -0.82
    bonté          -0.74
    crutches       -0.71
    fantasies      -0.71
    ]")]           -0.70
    miracles       -0.70
    synergies      -0.68
    nightmares     -0.68
    oracles        -0.67
    skyscrapers    -0.67
    Positive Logits
    Token          Logit
    number         0.80
    amount         0.77
    few            0.73
    lot            0.72
    degree         0.72
    portion        0.66
    range          0.65
    way            0.64
    list           0.59
    sense          0.59
    Activation Density: 0.492%

    No Known Activations