Explanations
terms related to physics and scientific discourse
Negative Logits
oway          -0.82
HCR           -0.79
illard        -0.72
ibaba         -0.72
ilee          -0.71
ged           -0.70
rations       -0.70
thumbnails    -0.70
ovo           -0.70
strom         -0.70
Positive Logits
physicists    1.11
physicist     1.10
physics       0.99
sciences      0.84
equations     0.83
simulation    0.80
ERN           0.79
ysics         0.79
puzz          0.78
Physics       0.77
Activations Density 0.008%