INDEX
Explanations
names and words containing "er" or "es" letters with high activation
plural noun forms and certain suffixes that indicate various attributes
Negative Logits
EDITION     -0.64
DERR        -0.64
龍          -0.63
upon        -0.63
minded      -0.62
avorite     -0.60
pants       -0.60
姫          -0.60
tunes       -0.59
DOWN        -0.57
POSITIVE LOGITS
ching       0.87
hiba        0.82
iency       0.82
cil         0.78
ience       0.77
idian       0.75
cher        0.73
itability   0.72
eto         0.71
isure       0.71
Activations Density 0.139%