Explanations
statements involving personal beliefs and opinions
I think, I believe, possibly
Negative Logits
enfans          -0.52
(blank)         -0.52
houſe           -0.47
ſtand           -0.47
neſs            -0.47
Sénat           -0.46
ModelRenderer   -0.46
Houſe           -0.45
erçe            -0.45
purpoſe         -0.44
Positive Logits
rungsseite      0.53
?)              0.50
?).             0.49
probably        0.47
Probably        0.46
](#             0.45
maybe           0.45
Probably        0.45
Pretty          0.43
(?)             0.43
Activations Density 0.031%