Explanations
the presence of words with specific vowel or consonant patterns
Negative Logits
ortunately   -0.78
ĪĴ           -0.78
Bened        -0.77
«ĺ           -0.73
pta          -0.69
Pwr          -0.69
Ĥª           -0.66
Vie          -0.66
Ͻ            -0.65
quartered    -0.65
Positive Logits
kids         1.14
Kids         1.11
arro         0.91
kid          0.89
bang         0.86
ARDS         0.86
illas        0.83
hawk         0.75
busters      0.74
apons        0.73
Activations Density: 0.013%