INDEX
Explanations
light and light-related phrases
Negative Logits
(no visible token)    0.47
(no visible token)    0.43
(no visible token)    0.41
lesson                0.39
的看着                0.39
(no visible token)    0.39
(no visible token)    0.38
лекар                 0.38
প্লে                  0.38
atypes                0.37
Positive Logits
bulb       0.88
bulb       0.78
light      0.77
enment     0.70
bulbs      0.70
ened       0.68
Bulb       0.68
terang     0.67
Light      0.66
hearted    0.66
Activations Density 0.052%
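
To make the density figure concrete, here is a minimal sketch, assuming that activation density means the fraction of token positions on which the feature fires. The page itself does not define the term. The function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and chosen only to illustrate the arithmetic.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of token positions whose activation exceeds `threshold`.

    Assumption: "density" counts a position as active when its
    activation is strictly greater than the threshold.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 52 active positions out of 100,000 tokens.
sample = [1.0] * 52 + [0.0] * 99_948
density = activation_density(sample)
print(f"{density:.3%}")
```

Under that assumption, a density of 0.052% corresponds to the feature firing on roughly 52 of every 100,000 tokens.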