INDEX
Explanations
phrases indicating the presence of individuals and relationships
Negative Logits
schlag    -0.33
ứ         -0.33
croce     -0.33
cheid     -0.32
はず      -0.31
бы        -0.31
')")      -0.30
пи        -0.30
coated    -0.29
way       -0.29
Positive Logits
interested    0.91
considering   0.82
thinking      0.77
wondering     0.77
unsure        0.76
planning      0.75
unfamiliar    0.75
serious       0.72
interested    0.71
feeling       0.71
Activations Density: 0.217%