INDEX
Explanations
actions, insights, data, architect, manager, proposal
Negative Logits
be          0.40
petroleum   0.35
all         0.32
government  0.32
the         0.31
human       0.31
ginseng     0.31
sweet       0.31
salary      0.31
stainless   0.31
Positive Logits
те  0.43
של  0.42
والم  0.37
チ  0.37
០០  0.37
т  0.37
鿓  0.36
ของ  0.36
이나  0.35
및  0.35
Activations Density: 0.301%