INDEX
Explanations
occurrences of the letter "o"
Negative Logits
etheless      -0.75
ump           -0.69
think         -0.64
hold          -0.62
places        -0.62
leaf          -0.61
umbered       -0.61
idget         -0.60
body          -0.60
ifference     -0.60
Positive Logits
================================================================  0.74
Finished      0.68
Desk          0.67
ewitness      0.67
rogen         0.66
lymp          0.66
£             0.65
vation        0.64
Results       0.63
gres          0.63
Activations Density: 0.012%