Explanations
moments of resolution or completion in narratives
finally and concluding events
Negative Logits
Token             Logit
ſelf              -0.72
<<<<<<<<<<<<<<    -0.71
ſeveral           -0.70
ſelves            -0.65
leſs              -0.65
pleaſure          -0.63
myſelf            -0.62
يميديا            -0.62
onely             -0.62
zelfde            -0.60
Positive Logits
Token             Logit
finally           1.04
Finally           0.98
finally           0.94
Finally           0.94
FINALLY           0.85
终于              0.76
finalmente        0.75
enfin             0.73
終於              0.73
Finalmente        0.72
Activation Density: 0.108%
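
To make the density figure easier to read, here is a minimal sketch that assumes "activation density" means the fraction of sampled token positions on which the feature fires (activation above zero). That definition is an assumption, since the page does not state one, and the function names and the threshold parameter are hypothetical.

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: density is counted per token position, not per document.
    """
    if len(activations) == 0:
        raise ValueError("activations must not be empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


def tokens_per_activation(density_percent: float) -> float:
    """Convert a density given in percent into the average number of
    token positions per activation."""
    if density_percent <= 0:
        raise ValueError("density_percent must be positive")
    return 100.0 / density_percent


if __name__ == "__main__":
    # Toy activations, invented purely for illustration.
    toy = [0.0, 0.0, 0.5, 0.0]
    print(f"toy density: {activation_density(toy):.2%}")
    # The density reported for this feature.
    print(f"tokens per activation: {tokens_per_activation(0.108):.0f}")
```

Under that reading, a density of 0.108% corresponds to the feature firing on roughly one token position in 926, which is in keeping with a feature that targets a single word such as "finally" and its translations.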