INDEX
Explanations
references to the concept of duality, contrast, or variations within a category
instances of the suffix "ile" in words
Negative Logits
ĸļ         -0.96
olver      -0.85
iversal    -0.85
ĵĺ         -0.85
axter      -0.83
esson      -0.82
icter      -0.77
icum       -0.77
itivity    -0.76
ington     -0.76
Positive Logits
mma        1.05
tto        1.05
zza        0.81
tta        0.77
tt         0.77
ttes       0.74
gged       0.73
vich       0.73
tsky       0.71
vel        0.68
Activations Density: 0.021%