INDEX
Explanations
terms associated with roles and functions in organizational or diplomatic contexts
Negative Logits
lian        -0.15
ening       -0.14
idy         -0.14
melon       -0.14
mel         -0.14
ollo        -0.14
ishments    -0.14
_VC         -0.14
dump        -0.14
itan        -0.14
Positive Logits
-agent      0.17
capped      0.17
provoc      0.17
cap         0.16
ëĭĺ         0.15
agent       0.15
bidden      0.14
oven        0.14
.changed    0.14
zy          0.14
Activations Density: 0.105%