INDEX
Explanations
negative contractions and phrases expressing disbelief or rejection
Negative Logits
realization    -0.20
realise        -0.18
realizes       -0.16
realizing      -0.16
irk            -0.15
realize        -0.15
hod            -0.15
ureka          -0.14
umer           -0.14
osen           -0.13
Positive Logits
recall         0.26
think          0.25
recalling      0.21
think          0.21
remember       0.20
recall         0.20
fault          0.20
recalled       0.20
recalls        0.19
myself         0.19
Activations Density 0.059%