INDEX
    Explanations

    popular media franchises

    Negative Logits
    <unused423>     0.52
    <unused364>     0.45
    <unused655>     0.45
    y               0.44
     اتصال          0.44
    <unused584>     0.44
    <unused1047>    0.43
    <unused1016>    0.43
    <unused1740>    0.42
    <unused355>     0.42
    Positive Logits
    -               0.45
     creatures      0.39
    '               0.38
    Edit            0.36
                    0.36
     villains       0.36
     chibi          0.36
     characters     0.34
     costume        0.34
     inhabitants    0.33
    Activation Density: 0.863%

    No Known Activations