Explanations
mentions of the name "Dave"
Negative Logits
人çī©    -0.18
jit     -0.16
loh     -0.15
'gc     -0.14
rish    -0.14
orners  -0.14
ecture  -0.14
ulu     -0.14
Ñħод    -0.13
_MAN    -0.13
Positive Logits
y       0.30
igh     0.23
ed      0.19
yh      0.17
IGH     0.16
edar    0.16
ÙĬد     0.16
amer    0.15
eder    0.15
yaw     0.15
Activations Density 0.005%