INDEX
Explanations
specific patterns and endings in words related to topics of finance or transactions
Negative Logits
y         -0.20
halt      -0.18
er        -0.18
eri       -0.18
hur       -0.18
erse      -0.18
erd       -0.17
erb       -0.17
erland    -0.17
hoo       -0.17
Positive Logits
ted       0.41
ters      0.39
ting      0.34
tings     0.33
tes       0.31
ts        0.28
ta        0.27
ti        0.25
ropolis   0.24
ty        0.24
Activations Density 0.125%