INDEX
Explanations
proper nouns or names
words associated with arguments or debates
Negative Logits
staking        -0.75
corrid         -0.61
attribution    -0.56
DEV            -0.56
patched        -0.55
mainline       -0.55
mosaic         -0.55
pandemonium    -0.54
ashore         -0.54
disinfect      -0.54
Positive Logits
iy             0.95
ai             0.92
iri            0.87
icz            0.87
angan          0.85
oro            0.84
oy             0.84
iaz            0.84
ua             0.83
aru            0.83
Activations Density 0.310%