INDEX
Explanations
percentages expressed in parentheses
Negative Logits
zbollah   -0.71
uran      -0.70
artif     -0.70
tyrann    -0.67
shack     -0.66
imus      -0.65
princ     -0.65
mus       -0.64
oria      -0.64
Syd       -0.64
Positive Logits
theless   0.87
ecided    0.78
osuke     0.78
erest     0.77
okers     0.75
Decay     0.72
+)        0.67
Unsure    0.67
TPS       0.66
apest     0.66
Activations Density: 0.018%