INDEX
    Explanations

    mentions of films and their attributes

    Negative Logits
    zw           -0.17
    ecurity      -0.16
    embre        -0.15
     Abed        -0.15
    .Toolkit     -0.15
    окÑĢем       -0.15
     triang      -0.14
    áb           -0.14
    haul         -0.14
    .lesson      -0.14
    Positive Logits
     Tel         0.23
     hero        0.22
     tol         0.22
     heroine     0.21
     mass        0.20
     keer        0.20
     Mega        0.20
     tel         0.20
     interval    0.20
     Hero        0.20
    Activation Density 0.011%

    No Known Activations