Explanations
phrases related to scientific research and experimental results
references to outcomes or findings
Negative Logits
resent         -0.71
yne            -0.66
Passage        -0.65
vigil          -0.65
actionGroup    -0.61
rainy          -0.61
vows           -0.59
maj            -0.59
capital        -0.58
kov            -0.57
Positive Logits
results        0.91
results        0.88
iments         0.85
Results        0.81
iveness        0.80
Results        0.79
ĸļ             0.76
ample          0.74
Panel          0.73
amples         0.73
Activations Density: 0.019%