Explanations
phrases emphasizing the significance of a concept or topic
Negative Logits
Token          Logit
érrez          -0.55
遇             -0.55
虚             -0.47
zh             -0.45
dụ             -0.44
虛             -0.44
Big            -0.44
big            -0.44
con            -0.43
incremental    -0.42
Positive Logits
Token          Logit
Extension      0.98
ReusableCell   0.97
extension      0.94
importance     0.90
ParallelGroup  0.86
importance     0.86
extension      0.82
Extension      0.79
Importance     0.77
กัน            0.76
Activations Density: 0.060%