INDEX
Explanations
No Explanations Found
Negative Logits
Token       Logit
i           -0.38
(blank)     -0.37
(blank)     -0.33
works       -0.31
,           -0.30
.           -0.30
most        -0.30
Jo          -0.29
"           -0.28
to          -0.27
Positive Logits
Token       Logit
queſta      0.93
ſelf        0.88
ſta         0.84
$_(         0.82
パンチラ    0.80
zwiſchen    0.79
ðsíða       0.79
geſ         0.79
ſeine       0.77
dieſes      0.77
Activations Density 0.000%
No Known Activations
This feature has no known activations.