INDEX
    Explanations
    Negative Logits
     faſt         -0.83
     ſever        -0.79
     eſſ          -0.76
     ſche         -0.75
     daysTop      -0.72
    <unused23>    -0.71
    Tikang        -0.71
    <unused17>    -0.71
    <unused14>    -0.71
    [@BOS@]       -0.71
    Positive Logits
     www          0.77
    www           0.73
    ://           0.69
     website      0.46
    :\/\/         0.40
     WWW          0.37
    <bos>         0.37
    website       0.37
    ="//          0.36
    Www           0.35
    Act Density 0.005%

    No Known Activations