INDEX
Explanations
instances where a comparison or decision between multiple options is being made
occurrences of the word "which."
Negative Logits
GROUND  -0.81
Balt    -0.73
Glob    -0.72
Rog     -0.71
rise    -0.71
Adams   -0.71
athi    -0.70
rene    -0.69
ben     -0.66
inence  -0.64
Positive Logits
kinds        0.84
sorts        0.82
contingency  0.75
soever       0.71
iple         0.70
flavors      0.69
delim        0.68
types        0.67
redes        0.66
species      0.64
Activations Density 0.052%