INDEX
Explanations
numeric values or time-related phrases
Negative Logits
amus        -0.17
Kirby       -0.16
ibel        -0.15
uman        -0.14
гÑĥб        -0.14
rift        -0.14
esse        -0.14
åĨĴ         -0.14
Geg         -0.13
ardy        -0.13
Positive Logits
fold        0.17
orra        0.16
Challenge   0.15
dni         0.14
Scre        0.14
ToOne       0.14
.extension  0.14
Austral     0.14
erras       0.13
æ³Ĭ         0.13
Activations Density 0.122%