Explanations
instances of the word "determine" and its variations, indicating a focus on decision-making or assessment processes
Negative Logits
oil       -0.76
athi      -0.73
talking   -0.72
amon      -0.69
rams      -0.68
ublic     -0.67
jan       -0.67
itual     -0.65
repe      -0.65
ovie      -0.65
Positive Logits
whether       1.13
eligibility   0.95
how           0.92
whether       0.85
validity      0.84
exactly       0.82
paternity     0.82
thresholds    0.82
beforehand    0.75
optimal       0.74
Activations Density 0.033%