INDEX
Explanations
geographical locations, specifically named regions
references to the name "Far" in various contexts
Negative Logits
sburgh       -0.91
ettings      -0.79
vironment    -0.76
ichick       -0.75
ysis         -0.67
İĭ           -0.67
essee        -0.67
Kinn         -0.63
Socrates     -0.63
ificial      -0.62
Positive Logits
riers        0.92
rier         0.91
ouk          0.87
az           0.87
rak          0.86
aday         0.85
thing        0.85
rer          0.84
ghan         0.83
rug          0.83
Activations Density 0.015%