INDEX
Explanations
occurrences of actions or verbs related to receiving, going, and interacting
instances in the text where actions are being described
Negative Logits
WER         -0.63
never       -0.63
Calculator  -0.62
Moroc       -0.62
spared      -0.61
bay         -0.58
ACP         -0.57
;;;;;;;;    -0.57
Neh         -0.56
Fuller      -0.56
Positive Logits
ibrary      0.73
adelphia    0.72
GROUP       0.65
olded       0.65
esta        0.64
irtual      0.61
icles       0.60
lict        0.60
somebody    0.58
solo        0.58
Activations Density 0.305%
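
The density figure above can be read as the share of sampled token positions on which the feature fires. The sketch below assumes exactly that definition (activation strictly greater than a threshold of zero); the page does not state how the figure was computed. The function name `activation_density` and the sample data are hypothetical, for illustration only.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: "density" means (active positions) / (all sampled positions).
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 1000 token positions, 3 of which activate the feature.
sample = [0.0] * 997 + [1.2, 0.4, 2.5]
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.305% would mean the feature is active on roughly 3 of every 1000 sampled tokens.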