Explanations
references to specific individuals named "Jean" in the text
Negative Logits
Token           Logit
holders         -0.85
llah            -0.70
aneers          -0.67
awar            -0.66
ulnerability    -0.66
TOP             -0.65
ãĥ¼ãĥ³          -0.65
Flavoring       -0.65
BIP             -0.64
ODUCT           -0.64
Positive Logits
Token           Logit
ette            1.06
etta            0.97
Claude          0.96
Francois        0.92
Jacques         0.90
Baptist         0.88
ois             0.86
Cla             0.85
Pierre          0.85
François        0.84
Activations Density: 0.016%