    NEGATIVE LOGITS
    Token                 Logit
    "-"                   -0.71
    " ver"                -0.54
    " ti"                 -0.52
    " t"                  -0.52
    " di"                 -0.49
    " ta"                 -0.48
    " g"                  -0.48
    " j"                  -0.47
    " sa"                 -0.47
    "?"                   -0.47

    POSITIVE LOGITS
    Token                 Logit
    " itſelf"             1.09
    " Efq"                1.02
    " BoxFit"             1.00
    " ſeveral"            0.96
    " Theſe"              0.94
    "ſelf"                0.94
    " myſelf"             0.93
    " photolibrary"       0.91
    " CreateTagHelper"    0.91
    " houſe"              0.91

    Activation Density: 0.009%

    No Known Activations