Explanations
words related to various types of start or beginning, as well as chronological markers
words indicating the start or progression of events or concepts
Negative Logits
Token         Logit
Ag            -0.67
bert          -0.67
weed          -0.65
ANA           -0.65
Matrix        -0.64
rehens        -0.61
ETF           -0.58
onz           -0.58
parency       -0.58
agg           -0.58
Positive Logits
Token         Logit
etheless      0.74
rul           0.70
IU            0.69
theirs        0.67
hers          0.62
thirsty       0.61
unprotected   0.61
KE            0.59
substituted   0.58
destro        0.58
Activations Density: 1.085%