INDEX
Explanations
expressions of sharing thoughts and perspectives from various individuals
Negative Logits
amburger       -0.19
phas           -0.15
consequence    -0.15
reperc         -0.14
Bes            -0.14
uil            -0.14
aroo           -0.14
หม             -0.14
inden          -0.14
ORMAT          -0.14
Positive Logits
thoughts       0.59
Thoughts       0.45
views          0.40
impressions    0.39
experiences    0.35
feelings       0.33
opinions       0.33
Thought        0.31
take           0.31
thought        0.31
Activations Density 0.120%