INDEX
Explanations
expressions of perception or opinion
Negative Logits
Exer        -0.43
Hosp        -0.43
tox         -0.40
Orig        -0.40
Northam     -0.40
PDR         -0.40
oxalate     -0.39
Hospital    -0.39
punch       -0.39
화          -0.39
Positive Logits
seem        1.42
seemed      1.35
seemed      1.33
Seems       1.30
seems       1.29
seems       1.26
Seems       1.26
Seem        1.21
seem        1.21
seeming     1.13
Activations Density: 0.145%