INDEX
Explanations
dates and times
punctuation marks and specific formatting symbols in text
Negative Logits
ela       -0.86
elo       -0.79
ãĥ¯       -0.78
Tes       -0.77
TPS       -0.74
ba        -0.72
Tes       -0.71
Alic      -0.71
ula       -0.70
Kelvin    -0.70
Positive Logits
11        1.31
11        1.21
111       1.01
111       0.94
117       0.94
112       0.94
Rah       0.93
November  0.90
116       0.89
1200      0.88
Activations Density: 0.141%