INDEX
    Explanations

    counterparts

    Negative Logits
    Token                  Logit
    "Philadelphia"         -0.07
    [token not shown]      -0.07
    " smelled"             -0.06
    " suspicious"          -0.06
    " Tested"              -0.06
    [token not shown]      -0.06
    " metre"               -0.06
    " mour"                -0.06
    " Religious"           -0.06
    " textView"            -0.06
    Positive Logits
    Token                  Logit
    " counterparts"        0.13
    " counterpart"         0.12
    " predecessor"         0.07
    " sexkontakte"         0.07
    "ique"                 0.07
    " 사이트"               0.06
    " successor"           0.06
    "иф"                   0.06
    [token not shown]      0.06
    "рут"                  0.06
    Activation Density: 0.009%

    No Known Activations