Explanations
numerical data points and their relevance to key concepts and trends
Negative Logits
Token       Logit
outers      -0.15
óng         -0.14
лоп         -0.14
ÑĭÑĪ        -0.14
adamente    -0.14
ubar        -0.14
llib        -0.13
à¸ķะ        -0.13
å¯          -0.13
ÏĥÏĥ        -0.13
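Several of the tokens listed above (ÑĭÑĪ, à¸ķะ, å¯, ÏĥÏĥ) look like byte-level BPE vocabulary strings rather than readable text. The sketch below shows one way to turn such strings back into characters. It assumes the tokenizer uses the GPT-2-style byte-to-unicode mapping, which the source does not confirm; the helper names `byte_to_unicode_table` and `decode_token` are hypothetical and exist only for illustration.

```python
def byte_to_unicode_table():
    """Build a GPT-2-style map from raw byte values to printable characters.

    Assumption: the tokens were displayed with this mapping.
    """
    # Bytes that are already printable map to themselves.
    printable = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(ord("¡"), ord("¬") + 1))
        + list(range(ord("®"), ord("ÿ") + 1))
    )
    byte_values = printable[:]
    code_points = printable[:]
    shift = 0
    # Every remaining byte is shifted to an unused code point at 256 and above.
    for b in range(256):
        if b not in printable:
            byte_values.append(b)
            code_points.append(256 + shift)
            shift += 1
    return {b: chr(c) for b, c in zip(byte_values, code_points)}


# Inverse map: displayed character -> original byte value.
UNICODE_TO_BYTE = {ch: b for b, ch in byte_to_unicode_table().items()}


def decode_token(token):
    """Recover the text of a byte-level token string (hypothetical helper)."""
    raw = bytearray()
    for ch in token:
        if ch in UNICODE_TO_BYTE:
            raw.append(UNICODE_TO_BYTE[ch])
        else:
            # Character is already decoded text; keep its UTF-8 bytes.
            raw.extend(ch.encode("utf-8"))
    # Incomplete byte sequences become U+FFFD instead of raising.
    return raw.decode("utf-8", errors="replace")


for token in ["ÑĭÑĪ", "ÏĥÏĥ", "outers"]:
    print(repr(token), "->", repr(decode_token(token)))
```

Under that assumption, ÑĭÑĪ decodes to ыш and ÏĥÏĥ to σσ, while plain ASCII tokens such as outers pass through unchanged. A token like å¯ holds only part of a multi-byte character, so it does not decode to a complete character on its own.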
POSITIVE LOGITS
Token       Logit
201         0.29
200         0.23
202         0.23
210         0.21
203         0.18
208         0.17
209         0.17
211         0.16
301         0.15
199         0.15
Activations Density: 0.075%