INDEX
Explanations
strings of numbers with a specific format, likely referring to technical specifications
references to numerical values or specifications
Negative Logits
tremend   -0.94
olicy     -0.84
awaru     -0.84
isode     -0.82
atem      -0.82
enhagen   -0.81
atre      -0.79
uppet     -0.79
andise    -0.78
ĺħ        -0.77
Positive Logits
384       1.30
6666      1.15
th        0.98
eenth     0.89
07        0.87
66666666  0.85
09        0.84
340       0.84
37        0.84
66        0.82
Activations Density: 0.027%