Explanations
words related to isolation or being separate from others
references to the concept of isolation
Negative Logits
orah      -0.93
enegger   -0.84
vous      -0.77
rote      -0.76
ãĥ£       -0.74
igel      -0.69
herty     -0.68
Benz      -0.67
eor       -0.66
rouse     -0.64
Positive Logits
isolation     1.19
wards         0.83
olation       0.80
confinement   0.78
ism           0.77
uous          0.76
plat          0.73
isol          0.73
yip           0.72
ively         0.69
Activations Density 0.009%