Explanations
emotions and states of mind associated with affection and remorse
Negative Logits
atical      -0.18
ATIVE       -0.17
cope        -0.16
ave         -0.15
Lite        -0.15
ful         -0.15
inic        -0.15
FUL         -0.15
.generic    -0.15
ativní      -0.15
Positive Logits
ски         0.19
ally        0.18
aqu         0.17
empor       0.17
ograd       0.16
ingleton    0.16
AILY        0.16
gly         0.16
andy        0.15
atab        0.15
Activations Density 0.084%