INDEX
Explanations
references to the comedy group Monty Python
instances of the letter 'y' and its variations in words
Negative Logits
Painter        -0.66
ratulations    -0.63
drm            -0.62
instein        -0.61
uld            -0.60
redo           -0.60
attribution    -0.60
rison          -0.59
Reviewer       -0.58
torrent        -0.58
Positive Logits
rules          0.76
»Ĵ             0.70
rius           0.70
uve            0.70
Abbey          0.66
arios          0.65
oris           0.63
omers          0.63
ository        0.61
IQ             0.61
Activations Density: 0.101%