INDEX
Explanations
references to robots or robotic-related terms
references to robots
Negative Logits
clair     -0.73
WAYS      -0.70
dn        -0.70
creen     -0.70
Beg       -0.70
uity      -0.68
ippi      -0.66
retion    -0.65
reens     -0.65
uating    -0.65
Positive Logits
ically    0.86
robot     0.82
anical    0.80
mascot    0.74
bots      0.72
swarm     0.72
robots    0.71
Robots    0.70
gri       0.70
ragon     0.69
Activations Density 0.027%