Explanations
references to the Iron Man character and associated terminology
Negative Logits
ourn     -0.17
led      -0.16
á»Ĩ      -0.16
enna     -0.14
uteur    -0.14
jev      -0.14
iva      -0.14
uries    -0.14
liers    -0.14
ivan     -0.14
Positive Logits
ically   0.26
ore      0.25
mong     0.25
iron     0.23
Iron     0.22
workers  0.22
iron     0.22
Iron     0.22
works    0.20
cl       0.20
Activations Density 0.009%