INDEX
Explanations
results or outcomes
references to outcomes or consequences of events
Negative Logits
Token       Logit
undai       -0.87
eways       -0.78
tradem      -0.76
mens        -0.74
icrobial    -0.72
neau        -0.69
spots       -0.69
ctic        -0.69
cot         -0.68
thur        -0.67
Positive Logits
Token       Logit
result      1.05
result      0.90
Result      0.88
Results     0.85
enance      0.83
iments      0.77
results     0.76
Enh         0.76
results     0.75
iment       0.70
Activations Density: 0.023%