INDEX
Explanations
feeling states
feelings and states of being
Negative Logits
以降        0.40
사고        0.37
વાહી        0.37
अभ्यर्थ      0.34
没想到      0.34
Enf         0.34
感兴趣      0.33
ionais      0.33
ilever      0.32
আয়াত        0.32
Positive Logits
comfortable   0.89
obligated     0.86
compelled     0.85
guilty        0.84
obliged       0.81
betrayed      0.78
confortable   0.77
cheated       0.76
comforted     0.73
confident     0.72
Activations Density: 0.091%