INDEX
Explanations
words related to transitions or sequences of events
conjunctions and connecting words that indicate relationships between ideas
Negative Logits
tremend      -0.87
hardest      -0.75
referen      -0.75
rare         -0.74
calming      -0.73
warr         -0.73
hard         -0.72
£ı           -0.71
recession    -0.71
exha         -0.71
Positive Logits
iste         0.82
pad          0.81
imi          0.81
illian       0.81
lys          0.81
atis         0.80
tec          0.80
hess         0.80
pedia        0.80
vier         0.79
Activations Density: 0.358%