INDEX
Explanations
strings that start with "AY" followed by either other letters or numbers
references to specific days, particularly 'day' and related terms
Negative Logits
ãĥ¯        -0.76
hered      -0.73
arium      -0.69
fed        -0.64
meg        -0.61
Scha       -0.61
itia       -0.60
priv       -0.58
Mamm       -0.58
Dresden    -0.58
Positive Logits
AY         1.43
DAY        1.09
WARD       1.05
OUT        1.02
NESS       0.97
GOODMAN    0.95
ER         0.92
ARY        0.91
YY         0.90
yip        0.89
Activations Density 0.003%
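
As a rough illustration of how a density figure like the one above can be read, the sketch below assumes that activation density means the share of sampled tokens on which the feature has a non-zero activation. That definition, the function name `activation_density`, and the sample data are all assumptions made for illustration, not taken from the dashboard itself.

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of tokens whose activation exceeds `threshold`.

    Assumed definition: density = 100 * (active tokens) / (all tokens).
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Hypothetical sample: the feature fires on 3 tokens out of 100,000.
acts = [0.0] * 100_000
acts[10] = 1.2
acts[500] = 0.4
acts[99_999] = 2.7

density = activation_density(acts)
print(f"{density:.3f}%")
```

Under this reading, a density of 0.003% corresponds to the feature firing on about 3 tokens in every 100,000, which fits a feature that responds to a narrow pattern such as the "AY" strings described in the explanations.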