Explanations
periods followed by descriptions or conclusions
sentences that indicate conclusions or statements of fact
Negative Logits
hoe             -0.72
inactive        -0.63
ilated          -0.61
ucer            -0.61
liest           -0.60
isine           -0.60
reet            -0.59
eteen           -0.59
intermediate    -0.58
userc           -0.58
Positive Logits
Flavoring       1.07
Whereas         1.06
Secondly        1.05
Thankfully      0.99
Fortunately     0.95
Instead         0.95
Certainly       0.95
Moreover        0.94
Hence           0.94
Luckily         0.94
Activations Density: 0.681%