INDEX
Explanations
references to specific names or entities
references to lists or rankings, typically in a prominent format
Negative Logits
Wil     -1.02
Bows    -0.96
WB      -0.95
Webb    -0.91
ws      -0.89
Woo     -0.88
SW      -0.87
wi      -0.87
Hels    -0.86
Wes     -0.86
Positive Logits
ent     0.99
card    0.91
cards   0.91
parser  0.88
ents    0.83
ENT     0.82
Gall    0.82
ñ       0.81
ental   0.80
phil    0.79
Activations Density 0.407%