INDEX
Explanations
mentions of specific mountain ranges
references to mountains
Negative Logits
tle      -0.87
lich     -0.72
NER      -0.70
cer      -0.66
cci      -0.66
Cam      -0.65
PAR      -0.65
OUT      -0.64
··       -0.63
Labor    -0.63
Positive Logits
mountains    1.06
Mountains    0.98
biome        0.81
dale         0.81
footh        0.79
challeng     0.79
lake         0.79
valley       0.78
valleys      0.78
terday       0.78
Activations Density: 0.012%