INDEX
Explanations
references to notable individuals and their connections in the context of film and politics
Negative Logits
招      -0.17
oud     -0.16
lez     -0.15
lef     -0.15
rå      -0.15
vida    -0.14
ULSE    -0.14
ìĶ      -0.14
loud    -0.13
strt    -0.13
Positive Logits
eventual    0.17
bek         0.17
future      0.15
later       0.15
enler       0.15
lj          0.15
당시        0.14
gest        0.14
flatten     0.14
будущ       0.13
Activations Density 0.176%