Explanations
actions related to waving or indicating through hand gestures
actions related to waving or signaling
Negative Logits
defic     -0.72
ワン      -0.71
unker     -0.70
Ranked    -0.68
д         -0.67
Colleges  -0.63
igne      -0.62
Forced    -0.60
シャ      -0.58
ESS       -0.58
Positive Logits
goodbye      1.19
lasses       0.96
hello        0.85
waving       0.82
uay          0.79
wheel        0.79
chairs       0.77
waved        0.77
torches      0.76
frantically  0.76
Activations Density: 0.027%