INDEX
Explanations
words that convey emotional quality or evaluative judgment
Negative Logits
ა                 -0.52
<eos>             -0.48
wer               -0.45
or                -0.44
lah               -0.44
generally         -0.44
(no token text)   -0.43
inc               -0.43
ime               -0.42
்க                -0.42
Positive Logits
pleaſure          1.34
purpoſe           1.34
myſelf            1.34
raiſ              1.31
themſelves        1.29
houſe             1.28
itſelf            1.27
Efq               1.27
Monfieur          1.25
ſeveral           1.25
Activations Density: 0.973%