INDEX
Explanations
references to a specific individual, likely in a historical or narrative context
Head Attr Weights
0:0.09
1:0.09
2:0.08
3:0.07
4:0.09
5:0.07
6:0.08
7:0.07
8:0.08
9:0.08
10:0.07
11:0.08
Negative Logits
idis      -2.08
Streamer  -2.07
apter     -2.07
hari      -2.07
add       -1.97
resa      -1.96
poke      -1.94
atalie    -1.89
iliate    -1.87
Danielle  -1.86
Positive Logits
olation    2.02
bombed     2.02
populated  2.00
)).        1.92
olated     1.91
Hits       1.87
hits       1.85
Built      1.83
brut       1.82
Secondly   1.78
Activations Density: 0.000%