INDEX
Explanations
the name "Mic" with varying levels of specificity (given by different activation values)
references to the character "Mic" in various contexts
Negative Logits
ACTED       -0.72
Unified     -0.71
Dull        -0.67
IVES        -0.64
ENGTH       -0.63
ANGEL       -0.60
Spit        -0.60
ãĤĤ         -0.59
OUGH        -0.58
tenance     -0.58
Positive Logits
roman       1.28
ropolitan   1.27
HAEL        1.22
helle       1.21
rones       1.15
illin       1.06
rom         1.05
onductor    1.04
olon        1.03
ron         1.03
Activations Density: 0.050%