INDEX
Explanations
mentions of the keyword "Fil" or variations like "fil" in the text
mentions of a specific entity or name
Negative Logits
lain         -0.64
compr        -0.64
Butterfly    -0.61
Stand        -0.58
eer          -0.58
Dakota       -0.58
Sovereign    -0.57
dummy        -0.57
convict      -0.57
Barnes       -0.57
Positive Logits
igree        1.35
ters         1.28
tered        1.23
tering       1.15
thy          1.12
ming         1.09
ms           1.03
ename        0.99
cipled       0.99
terness      0.98
Activations Density: 0.034%