INDEX
Explanations
references to titles, such as movies or books, ending with a punctuation mark
sentence-ending punctuation, indicating the completion of thoughts or statements
Negative Logits
izoph     -0.64
undet     -0.62
defe      -0.61
disemb    -0.60
footing   -0.59
monop     -0.58
ikuman    -0.58
unch      -0.58
ozo       -0.57
predec    -0.57
Positive Logits
Additionally   0.96
Moreover       0.94
Needless       0.93
Furthermore    0.92
Such           0.90
Similarly      0.89
Likewise       0.85
Meanwhile      0.85
Naturally      0.83
Their          0.83
Activations Density: 1.086%