INDEX
Explanations
expressions related to personal expectations and experiences
Negative Logits
minimise       -0.21
recognise      -0.20
armour         -0.20
recognised     -0.20
rumours        -0.19
realise        -0.19
manoe          -0.18
colourful      -0.18
specialised    -0.18
777            -0.18
Positive Logits
pop            0.23
bin            0.23
Pop            0.22
Pop            0.20
popped         0.20
pop            0.19
popping        0.19
lia            0.19
flick          0.19
(pop           0.18
Activations Density: 0.501%