Explanations
references to natural environments like wilderness
references to wilderness or natural environments
Negative Logits
Token       Logit
abolic      -0.74
odor        -0.71
akin        -0.71
eon         -0.69
abol        -0.67
orah        -0.67
rupulous    -0.67
Americans   -0.67
mus         -0.65
sei         -0.64
Positive Logits
Token       Logit
wilderness  1.19
erness      1.13
Wilderness  1.06
wasteland   0.88
ocene       0.88
paradise    0.81
frontier    0.80
pedia       0.75
Survival    0.75
jungle      0.75
Activations Density: 0.006%