INDEX
Explanations
references to specific individuals or their works
Negative Logits
azzi      -0.17
indy      -0.16
iT        -0.15
asmus     -0.15
ÙĪØ§Ø¡    -0.15
itet      -0.14
.ai       -0.14
TT        -0.14
amarin    -0.14
enheim    -0.14
Positive Logits
wig       0.20
ger       0.19
Lad       0.17
heck      0.16
ève       0.15
olph      0.15
ouce      0.15
-addon    0.14
rido      0.14
Down      0.14
Activations Density: 0.013%