Explanations
personal actions and belongings
references to specific items or features related to personal experiences and interactions
Negative Logits
ANS             -0.80
anas            -0.72
AN              -0.72
AN              -0.71
emanc           -0.70
Lex             -0.70
ACTIONS         -0.70
Austral         -0.69
TPPStreamerBot  -0.68
accompan        -0.68
Positive Logits
box             1.05
knob            0.93
disk            0.92
naire           0.92
Grid            0.90
grid            0.90
Box             0.90
table           0.89
tablet          0.87
tube            0.87
Activations Density 0.367%
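
As a rough illustration of how a density figure like the one above could be derived, the sketch below assumes that "Activations Density" means the percentage of sampled token positions on which the feature's activation is nonzero. That reading is an assumption, not something stated on this page. The function name `activation_density` and the sample values are hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of activations strictly above `threshold`.

    Assumption: density = share of token positions where the feature fires.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Hypothetical sample: the feature fires on 3 of 8 token positions.
sample = [0.0, 0.0, 1.2, 0.0, 0.4, 0.0, 0.0, 2.1]
density = activation_density(sample)
print(f"{density:.3f}%")  # prints 37.500%
```

Under this reading, a density of 0.367% would correspond to the feature firing on roughly 37 of every 10,000 sampled tokens.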