INDEX
Explanations
references to iron, both as the material itself and in metaphorical uses of the term
Negative Logits
Isles       -0.15
umes        -0.15
ataire      -0.15
McD         -0.15
usk         -0.15
いの        -0.15
пи          -0.14
ulfilled    -0.14
ervice      -0.14
uw          -0.13
Positive Logits
enschaft    0.17
wer         0.16
abwe        0.16
vore        0.15
OOK         0.14
odus        0.14
ابی         0.14
omin        0.14
_ALIGNMENT  0.14
pg          0.14
Activations Density 0.009%