INDEX
Explanations
expressions of interest or curiosity in various subjects
Negative Logits
Token       Logit
Caw         -0.58
Mahl        -0.55
machines    -0.53
cages       -0.52
Goy         -0.52
giù         -0.51
Machines    -0.51
Swain       -0.51
máquina     -0.50
Malk        -0.50
Positive Logits
Token       Logit
Interest    1.14
Interest    1.04
interest    1.03
interest    1.02
Interests   0.90
INTEREST    0.88
INTEREST    0.88
interests   0.81
interested  0.81
Interested  0.81
Activations Density 0.107%