INDEX
    Negative Logits
    Token             Logit
    "tones"           -0.70
    " Nanto"          -0.64
    ">["              -0.61
    " depths"         -0.60
    " distance"       -0.60
    " disturbances"   -0.60
    " abyss"          -0.60
    " swast"          -0.59
    " loyal"          -0.59
    " foss"           -0.59
    Positive Logits
    Token             Logit
    "HB"              0.80
    "783"             0.80
    "è¦ļéĨĴ"          0.79
    "ĨĴ"              0.78
    "20439"           0.77
    "393"             0.76
    "753"             0.73
    "793"             0.73
    "772"             0.72
    "262"             0.72
    Act Density 0.056%

    No Known Activations