INDEX
Explanations
words with the suffix '-y' followed by a strong activation value, particularly 'y' itself
occurrences of the letter 'y'
NEGATIVE LOGITS
raltar        -0.81
Examiner      -0.78
IBLE          -0.70
icably        -0.68
insula        -0.68
itures        -0.67
bernatorial   -0.65
ãĥ¯           -0.63
vou           -0.63
foss          -0.63
POSITIVE LOGITS
ield          1.02
ielding       0.99
Å«            0.92
aku           0.90
ank           0.86
olk           0.85
mbol          0.81
ikes          0.81
ng            0.80
STEM          0.80
Activations Density: 0.055%