INDEX
Explanations
names of individuals
repeated mentions of the term "Far."
Negative Logits
sburgh        -0.89
ually         -0.71
Kinn          -0.69
wrench        -0.67
vironment     -0.63
urally        -0.63
isance        -0.59
resid         -0.59
IBLE          -0.58
ettings       -0.58
Positive Logits
aday          1.05
ouk           1.04
riers         1.04
rier          1.00
agher         0.97
rer           0.92
thing         0.91
inas          0.89
abee          0.89
rah           0.86
Activations Density 0.023%