Explanations
phrases indicating personal experiences or actions
verbs indicating actions taken by the speaker
Negative Logits
wayne         -0.68
Individuals   -0.64
vation        -0.64
¯¯¯¯          -0.64
aic           -0.63
ween          -0.62
bia           -0.62
Entry         -0.62
Known         -0.61
bies          -0.60
Positive Logits
myself        1.00
EStream       0.79
é¾            0.77
eah           0.67
cyclopedia    0.66
sugg          0.66
pict          0.66
undai         0.65
looph         0.65
cffff         0.63
Activations Density: 0.290%