INDEX
Explanations
words or phrases related to comparisons or evaluations
references to philosophical concepts or ideas related to existence
Negative Logits
anwhile     -0.66
Meanwhile   -0.58
enthusi     -0.58
Seym        -0.54
Vaugh       -0.54
aunted      -0.54
Hundreds    -0.54
PDATE       -0.53
oppable     -0.53
kefeller    -0.52
Positive Logits
noun        1.14
)?          0.91
pronouns    0.90
abbre       0.89
verb        0.87
verbs       0.86
pronoun     0.86
adjective   0.85
adject      0.81
plur        0.77
Activations Density: 0.963%