Explanations
expressions related to joining or participating in activities or organizations
Negative Logits
Token      Logit
za         -0.15
agna       -0.15
enga       -0.15
zza        -0.15
uda        -0.15
/of        -0.15
ToLower    -0.15
bye        -0.14
çĦ¶        -0.14
ynch       -0.14
Positive Logits
Token      Logit
forces     0.60
forces     0.47
Forces     0.43
ranks      0.38
force      0.36
hands      0.33
forced     0.27
force      0.27
forced     0.25
efforts    0.25
Activations Density: 0.027%
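
The density figure can be read as the share of sampled tokens on which the feature fires. The sketch below shows that arithmetic under stated assumptions: the page does not give the tool's exact definition, so the "activation greater than zero" rule, the function name `activation_density`, and the sample counts (27 active tokens out of 100,000, chosen only so the result matches 0.027%) are all hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: a token counts as "active" when its activation is strictly
    greater than the threshold. The dashboard may define this differently.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample, not data from the page: 100,000 tokens, 27 of them active.
sample = [0.0] * 99_973 + [1.0] * 27
density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.027%
```

A density this low means the feature is sparse: it is silent on almost every token and fires only in a narrow set of contexts.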