INDEX
    Explanations

    phrases related to legal and political contexts

    special characters or unique symbols

    Negative Logits
    Token          Logit
    " Dupl"        -0.72
    "wagen"        -0.70
    " Farn"        -0.68
    " Kitt"        -0.65
    "izabeth"      -0.65
    " sacrific"    -0.65
    " conduc"      -0.65
    " destro"      -0.63
    " Eisen"       -0.63
    "itaire"       -0.62
    Positive Logits
    Token          Logit
    "ª"            1.13
    "Ĵ"            1.09
    "¹"            0.99
    "ł"            0.97
    "IJ"            0.96
    "ij"            0.95
    "¼"            0.92
    "ı"            0.91
    "³"            0.90
    "Ĥ"            0.90
    Activation Density: 0.103%

    No Known Activations