Explanations
references to visual aids like figures, diagrams, and charts in a technical context
references to visual aids such as figures, diagrams, and charts
Negative Logits
ibr         -0.66
redeemed    -0.66
omo         -0.65
ciplinary   -0.63
ocused      -0.63
ayers       -0.63
ilitation   -0.63
alties      -0.62
200000      -0.62
acer        -0.62
Positive Logits
!).         1.06
!),         1.04
!)          1.00
-)          1.00
)!          0.99
).          0.99
).          0.98
?).         0.97
)."         0.97
)...        0.94
Activations Density 0.049%