INDEX
Explanations
words related to objects and items that are typically used by humans
words that end with specific suffixes
Negative Logits
Token       Logit
earable     -0.83
RM          -0.70
Reps        -0.69
VR          -0.69
ukong       -0.68
utenberg    -0.67
ngth        -0.67
Expend      -0.64
itu         -0.62
ETH         -0.61
Positive Logits
Token       Logit
beetle      0.91
beetles     0.83
sticks      0.81
balls       0.79
circle      0.77
paddle      0.76
gland       0.76
herd        0.76
lect        0.76
fences      0.75
Activations Density: 0.237%