INDEX
    Explanations

    phrases related to actions or processes that involve taking, utilizing, or analyzing data

    Negative Logits
     houſe      -0.88
     purpoſe    -0.86
     itſelf     -0.80
     myſelf     -0.78
     raiſ       -0.78
    IBLIO       -0.78
     himſelf    -0.77
    ſelves      -0.76
     Theſe      -0.75
     Houſe      -0.75
    Positive Logits
     taken      1.10
    Taking      1.08
     Taking     1.08
     taking     1.08
     take       1.07
     takes      0.99
    take        0.96
    taken       0.96
     Take       0.95
     Taken      0.93
    Act Density 0.159%

    No Known Activations