Explanations
economic inflation and evolution
Negative Logits
област    0.46
Actions   0.40
transit   0.39
ınca      0.39
Transit   0.39
zás       0.38
Actions   0.38
transit   0.38
시민      0.38
Action    0.37
Positive Logits
Behavior     0.51
behavior     0.47
behavior     0.43
variation    0.41
perception   0.41
調           0.41
evolution    0.40
pulver       0.40
percep       0.40
benchmark    0.40
Activations Density: 0.007%