INDEX
Explanations
occurrences of the word "mere" with high activation values
Negative Logits
Also          -0.64
illas         -0.63
Locked        -0.63
destro        -0.63
smartest      -0.62
anwhile       -0.61
accordingly   -0.61
leneck        -0.61
extensively   -0.61
multipl       -0.60
Positive Logits
mortals       1.30
ifiable       0.82
superficial   0.78
incidental    0.77
curs          0.77
curiosity     0.77
ton           0.76
mortal        0.75
coincidence   0.74
consolation   0.72
Activations Density: 0.027%