INDEX
    Explanations

    information about person appearances in a film

    Negative Logits
    Token          Logit
    "ustomed"      -0.78
    "án"           -0.77
    "âĢij"          -0.76
    "20439"        -0.74
    "obil"         -0.74
    "bitious"      -0.74
    "etermined"    -0.73
    "ornings"      -0.72
    " ÂŃ"          -0.71
    "ensable"      -0.70

    Positive Logits
    Token          Logit
    " kinda"        1.08
    " anyways"      1.05
    " shitty"       1.02
    " lol"          1.00
    " devs"         0.99
    " stupidity"    0.98
    " fucking"      0.95
    " bullshit"     0.94
    " idiots"       0.93
    " stupid"       0.93
    Activation Density: 1.482%
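For readers unfamiliar with the density figure, here is a minimal sketch of how such a percentage could be computed. It assumes density means the fraction of token positions where the feature's activation is above a threshold; the page does not state its exact definition. The function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical and do not come from this feature.

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" is the share of token positions on which the
    feature fires at all. The source page does not define the term.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Synthetic data for illustration only: 15 active positions out of 1000.
sample = [0.0] * 985 + [0.5] * 15
density = activation_density(sample)
print(f"{density:.3%}")  # prints "1.500%"
```

Under this reading, a density of 1.482% would mean the feature fires on roughly 15 of every 1000 sampled tokens.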

    No Known Activations