INDEX
    Explanations

    melted state

    Negative Logits
    " hot"            -1.02
    " cup"            -0.92
    " Hot"            -0.81
    " Cup"            -0.76
    "hot"             -0.73
    "Hot"             -0.73
    " Cups"           -0.59
    "ist"             -0.58
    "脚注の使い方"    -0.57
    " cups"           -0.56
    Positive Logits
    " pleaſure"       0.65
    " myſelf"         0.64
    " ſta"            0.63
    "toid"            0.63
    " ſtate"          0.62
    " ſhould"         0.60
    "enic"            0.60
    " Chriſt"         0.59
    " élé"            0.59
    " theſe"          0.59
    Activation Density: 0.149%

    No Known Activations