INDEX
Explanations
a varied mix of letters and symbols, with sporadic activations
abbreviations or shorthand representations, possibly related to identifiers or codes
Negative Logits
today         -0.60
redibly       -0.54
ÂŃ            -0.54
shockingly    -0.51
reminder      -0.51
batter        -0.49
homicide      -0.49
INGTON        -0.48
opposite      -0.48
rented        -0.48
Positive Logits
ZX            1.11
==            1.11
=/            1.06
=             1.00
Ul            0.99
XM            0.98
Ry            0.97
dq            0.96
/+            0.95
Iv            0.95
Activations Density 0.027%