INDEX
Explanations
phrases related to giving instructions or providing information
elements related to personal experiences and reflections
Negative Logits
.","      -0.80
sed       -0.67
','       -0.62
arsity    -0.61
""        -0.59
ãĥı       -0.56
ãĥ´       -0.56
norm      -0.56
âĢķ       -0.56
sen       -0.54
Positive Logits
Conclusion    1.32
Lastly        1.30
Lastly        1.18
concludes     1.10
Finally       1.02
Finally       0.99
Conclusion    0.96
CONCLUS       0.95
Summary       0.94
concluding    0.93
Activations Density 0.414%
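
The density figure above is read here as the fraction of sampled token positions on which the feature has a nonzero activation. That reading is an assumption about how the dashboard defines it, not something the page states. A minimal sketch under that assumption, where the function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical:

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: "density" means the share of positions where the feature fires.
    """
    if not activations:
        raise ValueError("need at least one activation value")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 1,000 token positions, 4 of which activate the feature.
sample = [0.0] * 996 + [0.7, 1.2, 0.3, 2.5]
density = activation_density(sample)
print(f"Activations Density {density:.3%}")  # prints: Activations Density 0.400%
```

Under this reading, a density of 0.414% would mean the feature fires on roughly 4 of every 1,000 tokens.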