INDEX
Explanations
instructions related to physical exercises or technical tasks
Negative Logits
akings        -0.80
Rum           -0.74
Flavoring     -0.73
Diesel        -0.68
Reloaded      -0.66
Trouble       -0.63
ologne        -0.63
ivals         -0.63
rumors        -0.63
suspicions    -0.62
Positive Logits
perpendicular  1.53
angled         1.33
horizontal     1.27
downwards      1.26
perpend        1.21
horizontally   1.20
axis           1.16
diagonal       1.16
vertical       1.15
vertically     1.15
Activations Density 0.460%