Explanations
electric, electrical, electricity
Negative Logits
ين  1.15
ای  1.05
ം  1.05
리  0.98
жи  0.96
au  0.95
auxqu  0.95
การ  0.94
م  0.94
ใน  0.94
Positive Logits
↵↵  1.00
0  0.98
on  0.93
electric  0.85
it  0.84
electro  0.84
sen  0.83
electricity  0.82
this  0.82
not  0.81
Activations Density: 0.044%