INDEX
Explanations
proper names
occurrences of the suffix "ze"
NEGATIVE LOGITS
glim       -0.92
acles      -0.81
inarily    -0.74
paced      -0.74
ials       -0.73
runner     -0.73
ially      -0.72
runners    -0.71
istrate    -0.71
ainment    -0.66
POSITIVE LOGITS
lda        1.22
zinski     1.19
ppelin     1.18
ÅĤ         0.93
leck       0.86
cki        0.85
zza        0.82
uble       0.81
ichen      0.81
pps        0.79
Activations Density 0.040%