Explanations
statements of personal reflection or opinion
past thoughts and beliefs
Negative Logits
endblock    -0.33
XtraBars    -0.31
AsUp        -0.30
thanks      -0.29
{}'.        -0.29
lucro       -0.29
zufolge     -0.28
дії         -0.28
gants       -0.28
ของคุณ      -0.28
Positive Logits
thought            1.18
Thought            1.16
Thought            1.11
THOUGHT            1.10
thought            1.10
AssemblyTitle      0.82
dachte             0.74
tưởng              0.71
SequentialGroup    0.70
dacht              0.68
Activations Density 0.015%