INDEX
Explanations
proper nouns related to individuals and government figures
Negative Logits
":[          -0.66
':           -0.66
ascus        -0.65
Interest     -0.64
.....        -0.61
ciplinary    -0.61
.......      -0.61
"))          -0.61
…..          -0.58
hend         -0.58
Positive Logits
!).          1.24
?).          1.22
!),          1.12
?),          1.02
!)           0.97
?)           0.95
).           0.88
).           0.88
サーティワン  0.83
)?           0.82
Activations Density 0.817%