INDEX
Explanations
verbs related to actions or behaviors
verbs indicating actions or behaviors
NEGATIVE LOGITS
Campaign      -0.65
ãĥķãĤ©        -0.65
crim          -0.61
Applic        -0.60
ãĤ±           -0.59
berus         -0.59
APPLIC        -0.59
winner        -0.58
theless       -0.58
affer         -0.57
POSITIVE LOGITS
theirs        0.91
themselves    0.84
alike         0.78
ingly         0.74
their         0.70
eth           0.67
parach        0.65
Hitchcock     0.64
aldehyde      0.61
oba           0.61
Activations Density 0.422%