Explanations
references to a specific brand of beer
Negative Logits
Token      Logit
liest      -0.91
ulatory    -0.86
ulates     -0.85
angular    -0.84
quo        -0.83
ulating    -0.79
rous       -0.79
ulator     -0.75
rely       -0.74
ulators    -0.72
Positive Logits
Token      Logit
Miller     0.85
Mayhem     0.81
Lite       0.80
ophon      0.75
iday       0.74
quist      0.74
beer       0.73
myra       0.71
hound      0.70
burgh      0.70
Activations Density: 0.007%