INDEX
Explanations
key political figures or prominent names in discussions
Head Attr Weights
0:0.03
1:0.03
2:0.11
3:0.10
4:0.17
5:0.05
6:0.08
7:0.14
8:0.05
9:0.06
10:0.08
11:0.07
Negative Logits
Material: -1.88
inventoryQuantity: -1.83
ーク: -1.66
Alias: -1.62
Quant: -1.62
Untitled: -1.58
Rat: -1.58
/**: -1.53
Basic: -1.51
Books: -1.49
Positive Logits
versely: 1.68
nces: 1.64
rique: 1.61
rompt: 1.58
valleys: 1.53
luckily: 1.52
rha: 1.51
jer: 1.48
leep: 1.48
staggered: 1.47
Activations Density 0.000%