INDEX
Explanations
words related to uninhabitable places
words related to habitation and living conditions
Negative Logits
peel          -0.69
Stack         -0.68
Pikachu       -0.64
donations     -0.63
compliments   -0.62
whites        -0.62
RT            -0.62
opaque        -0.61
tip           -0.61
charges       -0.60
Positive Logits
hab           4.89
habi          2.22
Hab           1.31
hib           1.21
ha            1.12
Habit         1.08
hal           1.08
kef           1.07
hum           1.07
hy            1.04
Activations Density: 0.004%