Explanations
references to numerical values, specifically large figures
Negative Logits
MSN       -0.79
href      -0.76
lez       -0.75
stead     -0.73
cho       -0.72
ration    -0.69
ffield    -0.67
imen      -0.67
elly      -0.65
ravings   -0.65
Positive Logits
mAh       1.05
8000      1.03
8000      0.89
6000      0.89
6000      0.87
4000      0.83
7000      0.82
5000      0.81
4000      0.81
9000      0.80
Activations Density 0.009%