INDEX
    Explanations

    references to TV shows and their characteristics

    Negative Logits
     movie       -0.78
     film        -0.71
     movies      -0.69
     película    -0.63
     filme       -0.61
     films       -0.56
    film         -0.56
    Movie        -0.56
    movie        -0.56
    電影         -0.55
    Positive Logits
     houſe       0.87
     ſtate       0.83
     ſche        0.79
     pleaſure    0.78
     ſtre        0.77
     staffel     0.77
     raiſ        0.77
     purpoſe     0.76
     itſelf      0.75
     iſt         0.74
    Act Density 0.087%
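
The activation density figure above is commonly read as the percentage of token positions on which the feature has a nonzero activation. The sketch below illustrates that reading only; the function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and are not taken from the dashboard that produced this page.

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of token positions where the feature fires.

    Assumption: a position counts as active when its activation is
    strictly greater than `threshold` (0.0 by default).
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Hypothetical data: 10,000 token positions, 9 of them active.
sample = [0.0] * 10_000
for i in range(9):
    sample[i * 1000] = 1.5

density = activation_density(sample)
print(f"Act Density {density:.3f}%")
```

A density well below 1%, as reported here, would mean the feature fires on only a small fraction of the tokens it was evaluated on.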

    No Known Activations