INDEX
Explanations
phrases that introduce or discuss a topic
references to the concept of 'light' in various contexts
Negative Logits
irlf        -0.79
apego       -0.79
Legisl      -0.77
Merrill     -0.77
remlin      -0.74
Forever     -0.72
ihu         -0.70
Carnegie    -0.67
UGE         -0.67
irie        -0.66
Positive Logits
enment      1.12
ening       1.02
weights     0.94
ened        0.92
ener        0.87
bulb        0.86
eners       0.86
nings       0.80
heartedly   0.79
hearted     0.79
Activations Density 0.016%