Explanations
numbers or numerical patterns, especially those indicating a particular value or statistic
numeric values and percentages
Negative Logits
phrine       -0.78
Flavoring    -0.69
¥µ           -0.64
raltar       -0.61
Parade       -0.61
ourke        -0.60
ilant        -0.59
ions         -0.59
encers       -0.59
streamed     -0.59
Positive Logits
00           0.88
ielding      0.81
mares        0.75
50           0.71
wagen        0.71
hof          0.71
eenth        0.70
atural       0.70
ield         0.69
itely        0.69
Activations Density 0.032%
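
The page does not define the density figure. A minimal sketch, assuming "Activations Density" means the fraction of token positions on which the feature's activation is non-zero. The function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical, and the sample is built only to land on the same 0.032% value.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: density = (number of active positions) / (total positions).
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 100,000 token positions, 32 of which activate the feature.
sample = [0.0] * 100_000
for i in range(0, 100_000, 3125):  # 32 evenly spaced positions
    sample[i] = 1.7  # arbitrary positive activation value

density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.032%
```

Under this reading, a density of 0.032% means the feature fires on roughly 1 token in 3,000, which makes it a sparse feature.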