INDEX
Explanations
numbers with some specific patterns
numerical values and specific codes or identifiers
Negative Logits
igators     -0.75
ications    -0.72
uyomi       -0.71
ience       -0.70
Stage       -0.69
ition       -0.69
iment       -0.68
igious      -0.67
ist         -0.66
ioned       -0.66
Positive Logits
09          1.06
07          0.99
05          0.96
059         0.96
08          0.95
06          0.95
04          0.91
089         0.90
090         0.90
03          0.90
Activations Density 0.052%