    Negative Logits
    Token              Logit
    "eseorang"         -0.72
    " ſtate"           -0.72
    " diſt"            -0.71
    "脚注の使い方"     -0.70
    " Efq"             -0.70
    " myſelf"          -0.68
    " Monfieur"        -0.68
    " purpoſe"         -0.68
    " ainfi"           -0.67
    "verläs"           -0.66
    Positive Logits
    Token              Logit
    " curiosity"       1.16
    " curious"         1.11
    "curious"          0.91
    "curios"           0.88
    " Curiosity"       0.88
    " curios"          0.78
    "SequentialGroup"  0.76
    " bezeichneter"    0.75
    " curiosidad"      0.73
    "Curious"          0.73
    Activation Density: 0.131%

    No Known Activations