INDEX
    Explanations

    instances of notable public figures and their interactions in entertainment contexts

    Negative Logits
     seldom     -0.16
    ayet        -0.15
    OLID        -0.14
     Maritime   -0.14
    hed         -0.14
    ikel        -0.13
    umblr       -0.13
    èģļ         -0.13
    Withdraw    -0.13
     Withdraw   -0.13
    Positive Logits
     lip        0.27
     hilar      0.25
     ser        0.24
     imperson   0.22
    lip         0.20
     Lip        0.20
     prank      0.19
     belts      0.18
     kara       0.18
     parody     0.17
    Activation Density 0.167%

    No Known Activations