INDEX
Explanations
comparative descriptions
phrases that express thoughtful opinions or reflections
Negative Logits
Token      Logit
militar    -0.63
distrust   -0.61
Grimm      -0.59
irt        -0.59
Adams      -0.59
mistrust   -0.59
rus        -0.59
roma       -0.58
yon        -0.57
etheus     -0.57
Positive Logits
Token       Logit
anymore     0.77
articulate  0.73
coherent    0.71
myself      0.70
muster      0.70
words       0.69
describ     0.68
haha        0.67
stomach     0.67
peat        0.66
Activations Density 0.297%