INDEX
Explanations
references to the character Iron Man or related terminology
Negative Logits
stral      -0.15
ourn       -0.15
inders     -0.15
iva        -0.14
omor       -0.14
á»Ĩ        -0.14
liers      -0.14
еÑĩ        -0.14
ionario    -0.14
ibox       -0.14
Positive Logits
ically     0.27
mong       0.26
ore        0.25
workers    0.23
iron       0.22
Maiden     0.21
cl         0.21
Iron       0.20
maiden     0.20
Iron       0.20
Activations Density 0.009%