INDEX
Explanations
specific examples or instances within a context or scenario
phrases that introduce examples or case studies
Negative Logits
inately      -0.80
essed        -0.77
ournal       -0.76
priority     -0.70
istance      -0.68
ibles        -0.68
imately      -0.67
ief          -0.65
quire        -0.62
accessory    -0.62
Positive Logits
illustrate      1.05
Suppose         1.01
illustrates     0.98
Example         0.98
illustrating    0.95
illust          0.91
example         0.90
Example         0.90
examples        0.80
Examples        0.78
Activations Density: 0.231%