    Explanations

    references to the "Harry Potter" series and related characters

    Negative Logits
    Token          Logit
    " Meng"        -0.16
    "NC"           -0.15
    "808"          -0.14
    " Nu"          -0.14
    " Abstract"    -0.14
    "ird"          -0.14
    "388"          -0.14
    "209"          -0.14
    "GT"           -0.13
    "IRD"          -0.13

    Positive Logits
    Token            Logit
    " Harry"         0.54
    "Harry"          0.49
    " Potter"        0.47
    " Rowling"       0.40
    " HP"            0.40
    " wizard"        0.39
    " Hogwarts"      0.37
    " Dumbledore"    0.37
    " Snape"         0.37
    " Hermione"      0.36

    Activation Density: 0.022%

    No Known Activations