INDEX
Explanations
phrases related to evidence analysis and discussion
references to scientific evidence
Negative Logits
quer      -0.71
throats   -0.71
otom      -0.69
jug       -0.69
cer       -0.66
skill     -0.66
ategory   -0.65
ttle      -0.65
cise      -0.63
Hop       -0.61
Positive Logits
evidence  0.98
evidence  0.96
Evidence  0.94
Evidence  0.90
fulness   0.77
orial     0.76
suggests  0.76
evid      0.75
validity  0.75
proof     0.75
Activations Density: 0.023%