Explanations
text related to historical figures and events
Negative Logits
ufact         -1.08
kefeller      -0.95
gren          -0.92
manship       -0.92
senal         -0.91
zona          -0.86
ongyang       -0.86
bottleneck    -0.85
mable         -0.84
neighb        -0.84
Positive Logits
pmwiki        1.05
âĨij           1.01
References    0.97
][            0.92
].            0.89
...]          0.88
Sources       0.88
],"           0.87
"]            0.86
]             0.84
Activations Density: 0.206%