INDEX
Explanations
uppercase words or proper names
uppercase letters and specific letter combinations
Negative Logits
Kush        -0.61
NEWS        -0.55
DX          -0.55
VL          -0.54
yip         -0.52
enegger     -0.52
+.          -0.51
liking      -0.51
FISA        -0.51
VII         -0.50
Positive Logits
acies       0.67
anism       0.64
acious      0.63
agons       0.61
atism       0.60
angles      0.60
itized      0.58
icial       0.58
istically   0.58
igroup      0.58
Activations Density: 0.214%