INDEX
Explanations
phrases related to financial burdens or expenses
Negative Logits
BeginInit    -0.45
Kanpo        -0.41
externi      -0.40
المعيارى     -0.39
)\}$         -0.36
OGND         -0.35
>):          -0.34
)"),         -0.33
''');        -0.31
就好         -0.31
Positive Logits
nowhere      1.02
sight        0.84
bounds       0.80
whack        0.75
reach        0.72
necessity    0.71
harms        0.69
sync         0.67
             0.66
Nowhere      0.63
Activations Density: 0.100%